Observations
Without actually timing things, these are just speculations ...
It looks like you're basically using deferred as the strategy when you think that the pool is oversubscribed, which means that the task will be executed when someone calls get() on the future. I am not sure if this is really an improvement over letting the OS handle the oversubscription.
For example, if there is one thread setting off a whole bunch of smaller tasks, these will be scheduled async until the limit is hit, then they'll get deferred. As per cppreference (http://en.cppreference.com/w/cpp/thread/async), those will get executed on the first wait() or get() call on the future. So once the main thread wants those results it'll call get() on the futures, which means the deferred ones will run one after another on the main thread. Otherwise the futures have to be passed to other threads for execution, which doesn't seem very convenient.
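To make the point about deferred execution concrete, here is a minimal sketch that reports which thread actually runs a task under each launch policy. The helper names run_and_report and ran_on_caller are made up for illustration and are not part of your code.

```cpp
#include <cassert>
#include <future>
#include <thread>

// Hypothetical helper: launch a trivial task with the given policy and
// return the id of the thread that executed its body.
std::thread::id run_and_report(std::launch policy) {
    std::future<std::thread::id> f =
        std::async(policy, [] { return std::this_thread::get_id(); });
    // With std::launch::deferred the task body runs right here, inside get(),
    // on whichever thread makes this call.
    return f.get();
}

// Hypothetical helper: true if a task launched with `policy` ran on the
// thread that called get().
bool ran_on_caller(std::launch policy) {
    return run_and_report(policy) == std::this_thread::get_id();
}
```

Calling ran_on_caller(std::launch::deferred) should return true, while ran_on_caller(std::launch::async) should return false, so every task that falls back to deferred ends up serialized on whichever thread collects the results.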
num_threads is a global. If dj::async is only used from a small number of threads that might be OK, but if it's used from a larger number of threads, the cost of having to share this value between cores might have an impact on performance.
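If you are executing a loop around compare and exchange, you might as well think about using compare_exchange_weak rather than compare_exchange_strong. The weak version is allowed to fail spuriously, but the loop retries anyway, and on some platforms it can be cheaper.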
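As a rough sketch of what such a loop could look like, assuming num_threads is a std::atomic<unsigned> counter of running tasks (the type, the limit parameter, and the function names try_acquire_slot and release_slot are my assumptions, not taken from your code):

```cpp
#include <atomic>
#include <cassert>

// Assumed shape of the global counter; the real declaration may differ.
std::atomic<unsigned> num_threads{0};

// Hypothetical helper: reserve a slot if fewer than `limit` tasks are running.
// Returns true if the counter was incremented.
bool try_acquire_slot(unsigned limit) {
    unsigned current = num_threads.load();
    while (current < limit) {
        // On failure (spurious or real) `current` is updated to the value
        // actually stored, so the loop re-checks the limit and retries.
        if (num_threads.compare_exchange_weak(current, current + 1)) {
            return true;
        }
    }
    return false;  // limit reached; the caller would fall back to deferred
}

// Hypothetical helper: give the slot back when the task finishes.
void release_slot() {
    num_threads.fetch_sub(1);
}
```

Because the loop already handles a failed exchange by retrying, a spurious failure from the weak version costs only one extra iteration.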