Multithreading in scripting languages
Scripting languages and multithreading don’t go together. Sure, many scripting languages implement something which looks and works like threads, but in the end these are usually fake threads, which work like a single CPU thread and run on a single CPU core. Fake threads are of course useful, but they cannot leverage the full power of a modern CPU. Or GPU.
Today CPUs have many cores and GPUs have hundreds of little “hardware threads”. Typically a GPU has multiple cores, each of which can execute multiple threads simultaneously.
What makes many scripting languages useful is dynamic typing. Variables don’t have predetermined types, the types of objects they reference can change in time. But from CPU’s point of view objects in scripting languages are complex, it is difficult to make them modifiable atomically with regards to other threads.
However, it would be great to use a general purpose scripting language and be able to leverage multiple CPU cores as well as GPUs. Scripting languages are easy to use – this is their purpose – so they are perfect for simple tasks or prototyping.
There are three general approaches to creating a general-purpose scripting language with support for robust multithreading:
1. Making the interpreter/engine of a scripting language thread-safe.
This is certainly doable, but in effect it would make the interpreter very slow. Python has been criticized for years for being slow even without real multithreading support.
2. Designing a new scripting language specifically for multithreading.
This is an interesting option. Imperative and functional languages have a bigger potential for this approach. It is very difficult to design a new programming language which will be easy to use, though.
3. Modifying an existing language and applying some limitations for multithreaded operation.
This is something inbetween the other two options. This is the most promising option. The idea is to not sacrifice too much of a language’s flexibility but still provide robust multithreading support.
All three approaches have to take into account several common limitations.
The most common classes/kinds of variables on which functions operate are: locals, arguments, closures and globals.
Local variables which are not closures to inner functions are always thread-safe, because they are local to the executing function’s context.
This can be complicated by coroutines, functions which preserve their state across multiple calls. Local variables in a coroutine are no longer thread-safe if two threads want to call the coroutine simultaneously.
Function arguments are something between closures and locals from the programmer’s perspective. An object passed from the caller to the callee is accessible to both. If the callee is used as a thread function, the arguments are shared between two threads.
Closures share some similarities with function arguments, although they are in a way local variables to the function they belong to. Multiple threads may get spawned using a local function which has access to the closures. This way multiple threads may be trying to modify the closures.
Globals are the most susceptible to manipulation from multiple threads. Globals are not recommended in general, although they are typically more pervasive in scripting languages, esp. in languages where functions are first class objects and share the global namespace with other objects.
Immutability (or constness) is a very useful trait for variables. Immutable variables, or rather their immutable values, are thread-safe, because no thread can modify them, so all threads are safe to read them simultaneously. Therefore immutability is the ultimate key to thread safety. Unfortunately most scripting languages do not support explicit immutability. It is difficult to impose and control immutability in a dynamically typed language. Immutability would become another source of unexpected exceptions.
An example possible solution based on option 3 from the above list would be to leverage an existing language model but limit data exchange between threads. If threads cannot exchange data beyond special facilities like mailboxes, the risk of deadlocks or data races is low. It would work as follows.
All objects would have an additional binary state attribute, they would either be unique or shared.
A unique object is an object which is only accessible to the current thread and the thread can do anything with the object.
A shared object is an object which is accessible to multiple threads. A shared object is immutable and cannot be modified by any thread.
And here is how this new state would be used:
- When a program or script is started, all objects are marked as unique or shared or whichever is more convenient in the long run. In languages which have explicit immutability, immutable objects would be marked as shared. Otherwise all new objects are marked as unique.
- When a new thread is spawned, first all objects accessible to this thread are marked as shared. This is a deep operation and applies to all children of the global object and their children and so on. It also applies to the locals of the current function and all closures accessible to the current thread, as well as all saved coroutine state (continuations). Both the parent and the child thread proceed as usual once the child thread is spawned.
- Because shared objects are immutable, whenever a thread attempts to modify a shared object, a shallow copy of the object being modified is created, and that shallow copy is modified. The new copy is marked as unique.
There are two problems with the above approach. First of all spawning a new thread is quite expensive, because it requires walking all data visible to the parent thread and marking it as shared.
Secondly, the operation of shallow-copying shared objects to make them unique (mutable) is non-trivial, because it would require modifying all other objects which reference them as well in the same manner.
This second problem could be avoided if objects were always referenced by handles and the actual object pointers would be stored in a big array per-thread to which the handle would be an index. Only the object under the handle would be copied and made unique. Other threads wouldn’t notice this change.
The handle/array approach would also reduce the impact of the first problem, because instead of walking a tree of objects the interpreter would only have to walk a linear array.
The handle/array approach would have a negative impact on performance of modern engines where raw pointers are used internally to reference objects.
Another approach would be to use thread identifier to indicate which thread owns a particular object. If the thread identifier stored in a particular object would be equal to current thread’s identifier, it would mean that the object is unique and mutable and can be modified by this thread. If the identifier was different from the current thread’s identifier, the object would be considered shared and therefore immutable.
When a thread spawned another thread, the parent thread’s identifier would be modified. This way all objects belonging to this thread would automatically be marked as shared. The objects would be directly inherited by the child thread without any complicated operations.
Now when either the parent or the child thread tried to access an object, it would still have to do a shallow copy. It is not worth mentioning that the new copy would be marked with the thread’s identifier, because all objects created by a particular thread would always be marked with thread identifier of the creator thread.
With this approach objects would still have to be referenced by handles. Therefore a copy of the handle space would have to be made for each thread when a thread is spawned.
Summary – impact on the interpreter/engine
- All objects handles would have to be translated to access actual objects.
- To mitigate the impact of object handles, local objects in functions could still be held by pointer instead of by handle.
- This would be OK even if callees spawned threads which had access to the local variables, because spawning a thread would turn the objects into shared, so the local objects would have to be replaced if they were modified later by the parent thread.
- However all handles would still be kept up to date for closures as well as in coroutines.
- Global objects and properties of local objects would still have to be translated.
- Object refinement would remain the same. Reading object properties would be unaffected.
- Object modification would involve checking object owning thread identifier and creating a shallow copy of modified shared objects for the current thread.
- Each thread would have its own handle space.
- Creating a new thread would involve copying handle space from the parent thread.
Request for comments
Do you have any ideas how to further mitigate the penalties of using object handles?
Or can you think of a better general approach to the problem of multithreading in scripting languages?