
Thread Pool


Difficulty:
3/5


A thread pool is a software design pattern for achieving concurrency efficiently. A set of threads is kept resident in a pool, ready to execute at any moment, instead of being started up and destroyed for every brief task. This increases performance and minimizes latency, because it avoids short-lived threads whose startup and teardown cost eats into the concurrency speedup.

The reason is that creating and destroying a thread usually takes longer than keeping one alive and waiting.

Design

  • there is an array of threads - m_pool
  • there is a FIFO queue of tasks - m_tasks (a Task should be a wrapper for a Callable object). Tasks are added to the queue with enqueue()
  • all threads are started up front in the pool and they never quit running; if there is no task in the queue they wait/sleep
  • every incoming Task notifies a single thread (using a condition variable) to handle it
  • if a Task returns a result, the client can query its value later
  • the ThreadPool has an on/off flag to indicate whether it is running
  • thread synchronization is performed by a condition_variable together with a mutex - m_cond, m_mu
  • m_mu synchronizes access to the shared data, i.e. the two containers
  • m_cond notifies threads when a task arrives and puts them to sleep when there are no tasks to perform
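
The design points above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the names m_pool, m_tasks, m_mu, m_cond and enqueue come from the list, while SimplePool, workerLoop, m_running and runPoolDemo are hypothetical names chosen for the example. It also assumes the simplest possible task type, std::function<void()>, and leaves return values out.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool; only the m_* names and enqueue() follow the article.
class SimplePool {
public:
    explicit SimplePool(std::size_t threadCount) {
        // All threads are started up front and run until the pool is destroyed.
        for (std::size_t i = 0; i < threadCount; ++i)
            m_pool.emplace_back([this] { workerLoop(); });
    }

    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(m_mu);
            m_running = false;              // flip the on/off flag
        }
        m_cond.notify_all();                // wake every sleeping worker
        for (auto& t : m_pool) t.join();    // workers drain the queue, then exit
    }

    void enqueue(std::function<void()> task) {
        assert(task != nullptr);            // an empty std::function cannot be run
        {
            std::lock_guard<std::mutex> lock(m_mu);
            m_tasks.push(std::move(task));
        }
        m_cond.notify_one();                // one new task, one thread notified
    }

private:
    void workerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_mu);
                // Sleep until there is work or the pool is switched off.
                m_cond.wait(lock, [this] { return !m_running || !m_tasks.empty(); });
                if (!m_running && m_tasks.empty()) return;
                task = std::move(m_tasks.front());
                m_tasks.pop();
            }
            task();                         // run the task outside the lock
        }
    }

    std::vector<std::thread> m_pool;
    std::queue<std::function<void()>> m_tasks;
    std::mutex m_mu;
    std::condition_variable m_cond;
    bool m_running = true;                  // guarded by m_mu
};

// Hypothetical usage: 100 tiny tasks shared among 4 resident threads.
int runPoolDemo() {
    std::atomic<int> counter{0};
    {
        SimplePool pool(4);
        for (int i = 0; i < 100; ++i)
            pool.enqueue([&counter] { ++counter; });  // safe: counter outlives pool
    }   // destructor finishes the remaining tasks and joins the threads
    return counter.load();
}
```

Running the task outside the lock is the important detail here: m_mu is held only while touching the queue, so a long task does not block the other workers.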

Reasoning

The key implementation question I asked myself was: how can one create a container of callable objects? More generally, a container of generic objects that can accommodate functions of any signature. std::any came to mind, but I thought it would possibly be too much (making the entire class a template, that is). So I asked myself whether I could circumvent this somehow.

Then I thought a bare std::function<void()> wrapper could work as the element type of the task queue, since almost any C++ callable can be converted to a std::function. But this is only an indirection: inside this std::function I would need to bind the arguments up front (however many the client supplies). Alternatively I could use functionRef.

Then I realized that I also want to provide return values to the client. The only generic way I know of to do this in C++ is std::future. That calls for another level of indirection: a std::packaged_task, which wraps the std::function, which in turn wraps the call to our target (member) function. I obtain the future from the packaged_task up front and return it to the caller. Finally, I must make sure the packaged_task lives long enough to deliver the value, so dynamic allocation through a std::unique_ptr or std::shared_ptr is suitable. I decided on a std::shared_ptr, as std::unique_ptr was unfortunately not enough (to be totally honest I'm not sure why; if you do check the code, leave me a comment).

After blood, sweat and tears, I'm proud of how it turned out.
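
The layering described above might look like the sketch below. It is an illustration under assumptions, not the repository's code: TaskQueue, submit, add and runSubmitDemo are hypothetical names, std::bind stands in for whatever binds the arguments in the real enqueue, and the queue is drained inline instead of by worker threads to keep the example short. It also suggests one likely explanation for the std::unique_ptr problem, assuming the queue stores std::function<void()>: std::function requires its target to be copyable, and a lambda that captures a std::unique_ptr is move-only, whereas one that captures a std::shared_ptr can be copied.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical alias: a queue of type-erased, argument-free tasks.
using TaskQueue = std::queue<std::function<void()>>;

// Hypothetical helper: binds the arguments up front, wraps the call in a
// packaged_task, queues it, and hands the future back to the caller.
template <typename F, typename... Args>
auto submit(TaskQueue& tasks, F&& f, Args&&... args) {
    auto bound = std::bind(std::forward<F>(f), std::forward<Args>(args)...);
    using R = decltype(bound());            // result type of the bound call

    // shared_ptr keeps the packaged_task alive until the queued copy has run.
    auto task = std::make_shared<std::packaged_task<R()>>(std::move(bound));
    std::future<R> result = task->get_future();
    assert(result.valid());

    // The lambda copies the shared_ptr, so it is copyable and fits std::function.
    // Capturing a std::unique_ptr here would make the lambda move-only, and
    // converting it to std::function<void()> would not compile.
    tasks.push([task] { (*task)(); });
    return result;
}

int add(int a, int b) { return a + b; }

// Hypothetical usage: queue one call, run it, read the result from the future.
int runSubmitDemo() {
    TaskQueue tasks;
    std::future<int> sum = submit(tasks, add, 2, 3);
    while (!tasks.empty()) {                // a worker thread would do this part
        auto task = std::move(tasks.front());
        tasks.pop();
        task();
    }
    return sum.get();                       // 2 + 3
}
```

In C++23, std::move_only_function accepts move-only callables, so a queue of that type could hold a lambda owning a std::unique_ptr; with C++17 the std::shared_ptr route is the straightforward one.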

Implementation Details

In enqueue you can see that I std::move the parameters into the capture list instead of capturing them by reference. This is because threads may outlive the scope in which the task was created. In that case, any local variable captured by reference may already be destroyed while the thread is still running.

I built the project with Visual Studio on Windows x86_64, using C++17. The code itself is cross-platform C++.
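
A small sketch of the lifetime issue, independent of the pool itself. The names startWorker and runCaptureDemo are hypothetical, and a plain std::thread stands in for a pool thread.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper: starts a thread that outlives this function's scope.
std::thread startWorker(std::string& out) {
    std::string message = "built in startWorker";   // local, destroyed at return

    // Capturing `message` by reference ([&message]) would leave the thread with
    // a dangling reference once this function returns. Moving it into the
    // closure makes the thread the owner of the string instead.
    return std::thread([msg = std::move(message), &out] { out = msg; });
}

std::string runCaptureDemo() {
    std::string out;                 // captured by reference, but joined below
    std::thread t = startWorker(out);
    assert(t.joinable());
    t.join();                        // `out` is still alive while the thread runs
    return out;
}
```

Capturing out by reference is acceptable here only because the caller joins the thread before out goes out of scope; a pool cannot make that promise about its clients' locals, which is why moving into the capture list is the safer default.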

GitHub

GitHub repository link.

Acknowledgements

The Wikipedia article on thread pools is (surprisingly?) succinct and well written.

