Home > Computing > Mutex vs. binary semaphore

Mutex vs. binary semaphore

Mutices and semaphores are among the most basic tools in multithreaded programming. However most people I asked do not know what is the difference between them. So please let me introduce you to them.

Consider a resource which is shared between multiple threads. For example a container. You don’t want to have multiple threads modifying the container simultaneously, or one thread modifying the container while other threads are reading from it, otherwise you will end up with classical race conditions and unpredictable things will happen.

To guard a resource from other threads while you are accessing it, you use a synchronization primitive, which you conceptually associate with the guarded resource. Threads can lock the synchronization primitive when they need to access the resource. Then they can release the primitive after they are finished with accessing the resource. When the synchronization primitive is already locked, a thread trying to lock it will wait/stall until the other thread who locked it – unlocks it.

At the first glance, both mutex and binary semaphore fit the description of the above synchronization primitive. Well, not quite. Using binary semaphore in place of a mutex is a bad idea.

Conceptually a semaphore is like an integer. You can increment it and you can decrement it. If the semaphore’s value is 0, the thread trying to decrement it will wait/stall until somebody else increments it. This way, the semaphore never has a negative value.

A binary semaphore is just a semaphore capped at one, i.e. it’s value cannot exceed one. You can treat the decrement operation as “lock” and the increment operation as “unlock”.

The problem with the semaphore is that any thread can increment it or decrement it. In particular, if the semaphore’s value is 0 (“locked”), another thread can increment it (“unlock”), even if this is not the thread which locked it! It takes more discipline to write code which correctly uses binary semaphores for locking and there is still a potential for error.

Another problem is that most semaphore implementations allow sharing semaphores between processes. This makes them much heavier than e.g. POSIX mutices or critical sections on Windows, which are lightweight, because they only work within one process and don’t require calling into kernel space in most cases.

Unlike binary semaphores,mutices may also have another interesting property, depending on an implementation, i.e. they can be recursive. A recursive mutex can be locked twice from the same thread. This allows you to write an accessor function and not have to care whether the mutex is already locked by the current thread or not. However it is generally not recommended to use recursive mutices, the necessity for recursive mutices is a sign of a bad design and indicates that there may be potential problems with the interfaces or even hidden multithreading bugs.

In general mutices (or critical sections on Windows) are typically recommended over binary semaphores as synchronization primitives between multiple threads in the same process.

Categories: Computing
  1. Piotr Holonowicz
    29.05.2013 at 00:20

    There is also one, much more nasty problem with the semaphores (both binary and counting). It is the priority inversion – a semaphore locked by a low priority process can block the whole OS if the higher priority processes need to wait for unlocking/releasing the semaphore. This condition happens though only for OS that uses fixed priority preemptive scheduling – most RTOS for embedded devices are prone.

  2. SOL
    19.08.2013 at 12:10

    Threading it’s an old school idea. Dogma which follows us every day in programming. Better is to avoid using mutex or anything like that. The best do not use threads at all – http://vimeo.com/71278954

  3. Anshuman Manral
    3.03.2014 at 07:24

    The problem with the semaphore is that any thread can increment it or decrement it. In particular, if the semaphore’s value is 0 (“locked”), another thread can increment it (“unlock”), even if this is not the thread which locked it!
    – Is it not that the same property of (binary) semaphores makes them more suitable for synchronization rather than protection? So for example, I want to share a task between two threads, for example task ‘a’ writes into a queue from which task ‘b’ must read. Here task ‘b’ waits on the binary semaphore before it can read from the queue. Whenever task ‘a’ writes into the queue it unlocks the semaphore thus ‘indicating’ to task ‘b’ that there is some data available on the queue for it to read.

  4. Tom
    27.10.2014 at 14:33

    Semaphores are more widely used for task-task sync or task-ISR sync, while it is better to use mutex for mutually exclusive access to resources. And task synchronization is not same thing as resource access.

  5. bhupesh
  6. theleancoder
    14.10.2015 at 23:42

    Take a look at my efforts on explaining both of these concepts in a easy to understand way at https://theleancoder.wordpress.com/2015/10/14/diving-deep-with-semaphore-and-mutex/

  1. 16.10.2015 at 07:46
  2. 30.12.2015 at 18:18

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: