Mutex vs. binary semaphore
Mutices and semaphores are among the most basic tools in multithreaded programming. However most people I asked do not know what is the difference between them. So please let me introduce you to them.
Consider a resource which is shared between multiple threads. For example a container. You don’t want to have multiple threads modifying the container simultaneously, or one thread modifying the container while other threads are reading from it, otherwise you will end up with classical race conditions and unpredictable things will happen.
To guard a resource from other threads while you are accessing it, you use a synchronization primitive, which you conceptually associate with the guarded resource. Threads can lock the synchronization primitive when they need to access the resource. Then they can release the primitive after they are finished with accessing the resource. When the synchronization primitive is already locked, a thread trying to lock it will wait/stall until the other thread who locked it – unlocks it.
At the first glance, both mutex and binary semaphore fit the description of the above synchronization primitive. Well, not quite. Using binary semaphore in place of a mutex is a bad idea.
Conceptually a semaphore is like an integer. You can increment it and you can decrement it. If the semaphore’s value is 0, the thread trying to decrement it will wait/stall until somebody else increments it. This way, the semaphore never has a negative value.
A binary semaphore is just a semaphore capped at one, i.e. it’s value cannot exceed one. You can treat the decrement operation as “lock” and the increment operation as “unlock”.
The problem with the semaphore is that any thread can increment it or decrement it. In particular, if the semaphore’s value is 0 (“locked”), another thread can increment it (“unlock”), even if this is not the thread which locked it! It takes more discipline to write code which correctly uses binary semaphores for locking and there is still a potential for error.
Another problem is that most semaphore implementations allow sharing semaphores between processes. This makes them much heavier than e.g. POSIX mutices or critical sections on Windows, which are lightweight, because they only work within one process and don’t require calling into kernel space in most cases.
Unlike binary semaphores,mutices may also have another interesting property, depending on an implementation, i.e. they can be recursive. A recursive mutex can be locked twice from the same thread. This allows you to write an accessor function and not have to care whether the mutex is already locked by the current thread or not. However it is generally not recommended to use recursive mutices, the necessity for recursive mutices is a sign of a bad design and indicates that there may be potential problems with the interfaces or even hidden multithreading bugs.
In general mutices (or critical sections on Windows) are typically recommended over binary semaphores as synchronization primitives between multiple threads in the same process.