Understanding Kotlin’s coroutines

3. An internal, cooperative software scheduler

Coroutines are not concurrent

There is something that needs to be clarified from the beginning with coroutines : they are not threads. They are not even “sub-threads” in the sense that threads are “sub-processes” : coroutines do not run concurrently. However, among other things, they are able to :

  1. execute the code inside them in a different order from the one in which it is written
  2. facilitate switching the execution of some code from one thread to another

Coroutines in parallel threads? Surely that sounds like concurrent execution, right? Yes, it is; but that is because of multi-threading, not because of coroutines. In a given thread, coroutines do not execute concurrently; and when multiple threads are running and accessing common resources, you have to expect the same concurrency issues as usual, whether you are using coroutines or not.

Since (as stated in point 2 above) coroutines make it so easy to switch execution from one thread to another, it can sometimes be confusing to follow what is executed in which thread, concurrently with what. The key is to keep track of which thread executes which coroutine. But don’t worry, this is simpler than it sounds.
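To make the first point concrete, here is a minimal sketch (not one of the article's own examples) of two coroutines sharing a single thread. It assumes the kotlinx.coroutines library is available and relies on the runBlocking() and launch() builders that are introduced at the end of this section; the name interleavedSteps is made up for illustration.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper : runs two coroutines on the calling thread
// and records the order in which their steps are executed.
fun interleavedSteps(): List<String> {
    // Only one thread ever touches this list, so no lock is needed
    val steps = mutableListOf<String>()
    runBlocking {
        launch {
            steps += "A1"
            delay(100) // A suspends and hands the thread over
            steps += "A2"
        }
        launch {
            steps += "B1"
            delay(50) // B waits for a shorter time, so it resumes first
            steps += "B2"
        }
    }
    return steps
}

fun main() {
    println(interleavedSteps()) // prints [A1, B1, B2, A2]
}
```

The steps are interleaved, but at any given moment only one of the two coroutines is actually running on the thread.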

Suspending functions

At the heart of coroutines are suspending functions. Suspending functions are functions that are able to, well… suspend. Here, suspend means that the function itself tells the rest of the program that it doesn’t have anything to do for the moment, and that the thread that is currently executing it could be used for something else. For what else? Another coroutine.

What makes a function suspending is that it calls other suspending functions. This is formalized by adding the suspend modifier to the function signature :

suspend fun mySuspendingFunction() {
    // ...
    // Include calls to one or more suspending functions
}

If functions are suspending because they call other suspending functions, there must be something at the root of this hierarchy. In fact there are multiple things there (and you could even define your own if you’d like, this is not some compiler magic), but for now we will focus on the simplest one : delay().

delay(timeMillis: Long) is not a keyword or anything, it is just a suspending function defined in Kotlin’s coroutines library that can be used by another suspending function to “sleep” for the given amount of time (in milliseconds) and give the thread to another coroutine. If there is no other coroutine to run, the thread itself is put to sleep until a coroutine has to resume (for example when a delay expires).

Note : It is important to understand the difference between delay(), which is suspending, and Thread.sleep(), which is blocking :

  • delay() allows other coroutines to run during this time. It is a mechanism specific to coroutines, which can only be called from a coroutine or from another suspending function.
  • Thread.sleep() has nothing to do with coroutines, it simply blocks the thread for the required amount of time, so nothing else can be executed in this thread in the meantime. You shouldn’t use it inside a coroutine, and in fact, you should consider not using it at all in Kotlin : when you need your code to wait for something, try creating a coroutine and use delay() instead.
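The difference can be measured with a small sketch like the one below, which assumes kotlinx.coroutines is available. The helper elapsedWith is hypothetical : it starts two coroutines on the same thread, lets each of them wait using the function it is given, and returns the total time spent.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.system.measureTimeMillis

// Hypothetical helper : runs two coroutines on the calling thread,
// each of them waiting with the given function, and returns the elapsed time.
fun elapsedWith(wait: suspend () -> Unit): Long = measureTimeMillis {
    runBlocking {
        repeat(2) { launch { wait() } }
    }
}

fun main() {
    // Suspending : both waits overlap, so the total is roughly 200 ms
    val suspending = elapsedWith { delay(200) }
    // Blocking : the second coroutine cannot even start before the first
    // one is done, so the total is roughly 400 ms
    val blocking = elapsedWith { Thread.sleep(200) }
    println("delay : $suspending ms, Thread.sleep : $blocking ms")
}
```

The exact figures depend on the machine, but the blocking version should take about twice as long as the suspending one.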

So, to sum up, suspending functions are good team-players that allow their function friends to have the privilege of execution when they don’t need it themselves. But how exactly does that handover of execution work?

Dispatchers and cooperative scheduling

For this purpose, coroutines rely on dispatchers. A dispatcher is a software component provided by Kotlin’s library that is responsible for managing the execution of one or more coroutines inside a thread, or a pool of threads.

You can think of dispatchers as a kind of scheduler that runs coroutines instead of threads. The analogy goes like this : where schedulers manage processes and threads running on CPU cores, dispatchers manage coroutines running in threads. However, there is an important distinction between them : dispatchers are cooperative schedulers, rather than preemptive ones (remember when I said this detail would come up again?). That means that a running coroutine needs to explicitly suspend in order for another to be able to resume. If a poorly-written coroutine never suspends, the other coroutines in the same thread are blocked and there is nothing the dispatcher can do about it.

Note the terminology here : threads are interrupted externally (preempted) by a scheduler, while coroutines voluntarily suspend inside a dispatcher.

There are three standard dispatchers provided by the library that are useful on a day-to-day basis : Main, Default and IO.

  • Dispatchers.Main is used for the main thread (implementation-dependent, not always provided)
  • Dispatchers.Default automatically manages a pool of background threads and is primarily intended for offloading heavy computations from the main thread. By default, the dispatcher will create as many threads as required to run all your coroutines in parallel, up to the number of CPU cores available (there would be no performance gained by creating more threads anyway).
  • Dispatchers.IO also manages a pool of threads, but is instead intended and optimized for asynchronous IO operations (as its name implies) that wait for data from the disk, the network, or any kind of peripheral.

A fourth standard dispatcher, Dispatchers.Unconfined, is not really meant to be used directly in usual applications. Finally, it is also possible to create new dispatchers in new dedicated threads if necessary (as shown in Example 5 later).
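As a rough illustration, the sketch below asks two of these dispatchers which thread they use to run a block of code. It assumes kotlinx.coroutines is available; the helper threadNameOn is hypothetical, and it passes the dispatcher to runBlocking() (presented at the end of this section) so that the block runs on that dispatcher while the calling thread waits. Dispatchers.Main is left out because it is only available when the platform provides a main-thread implementation.

```kotlin
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking

// Hypothetical helper : runs a block on the given dispatcher
// and reports the name of the thread that executed it.
fun threadNameOn(dispatcher: CoroutineDispatcher): String =
    runBlocking(dispatcher) { Thread.currentThread().name }

fun main() {
    println("caller  : ${Thread.currentThread().name}")
    println("Default : ${threadNameOn(Dispatchers.Default)}")
    println("IO      : ${threadNameOn(Dispatchers.IO)}")
}
```

The thread names printed depend on the library version and the platform, but the last two lines should show background worker threads rather than the calling thread.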

So, when a suspending function suspends (for instance, when delay() is called), it tells the current dispatcher (the one that was assigned to this coroutine when it was created) that this function doesn’t need to be executed for that amount of time; the dispatcher will not consider it for execution during this period, and will try to resume another coroutine instead. Note that, since this is a cooperative scheduling mechanism, the given delay is only a minimum : if the thread is busy with another coroutine when the delay expires, your function will have to wait until the thread is free before the dispatcher has a chance to resume it (this is demonstrated in Example 2).
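In the meantime, here is a compact sketch of that last remark, assuming kotlinx.coroutines is available (the name actualDelayMillis is made up). One coroutine asks for a 50 ms delay while a second, badly-behaved coroutine blocks the only thread for 200 ms without ever suspending.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.system.measureTimeMillis

// Hypothetical helper : returns how long a 50 ms delay really lasted
// when another coroutine hogs the thread in the meantime.
fun actualDelayMillis(): Long {
    var waited = 0L
    runBlocking {
        launch {
            // Asks to be resumed after 50 ms
            waited = measureTimeMillis { delay(50) }
        }
        launch {
            // Blocks the only thread for 200 ms without suspending
            Thread.sleep(200)
        }
    }
    return waited
}

fun main() {
    println("Asked for 50 ms, waited ${actualDelayMillis()} ms")
}
```

The first coroutine cannot resume before the second one releases the thread, so the measured time should be around 200 ms rather than 50 ms.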

Asynchronous execution

With this, it is already possible to implement asynchronous programming techniques. For example, consider the following scenario : you are developing an application with a user interface and you want to show some kind of pop-up notification to the user that automatically disappears after 5 seconds. Sounds simple enough, right? However, implementing that asynchronous behavior without coroutines is not actually trivial :

  • If you use a synchronous waiting mechanism such as Thread.sleep(), you block your current thread, i.e. your main thread running the UI. This makes your application unresponsive, which is obviously not acceptable.
  • If you defer the task to a background thread, you will run into concurrency issues : UIs are generally not thread-safe so you can’t dismiss the pop-up from a thread other than the main thread (Android, for instance, enforces this). Furthermore, creating a dedicated thread for such a simple task is a wildly inefficient use of system resources.
  • Usually, the best solution is to use some mechanism provided by the UI framework you are working with, in order to trigger the execution of some code (here, the dismissal of the pop-up) at a later time. On Android, this is what Handler.postDelayed() is for.

Coroutines provide a more elegant and general way of implementing this behavior : just create a new coroutine inside the current (main) thread that uses delay(). Since delay() suspends the coroutine and allows the thread to continue executing other tasks, it will neither block the UI nor cause concurrency issues.

suspend fun dismissPopupAfter(timeoutMillis: Long) {
    delay(timeoutMillis) // Suspends the coroutine without blocking the thread
    dismissPopup() // Non-suspending function
}

Cancellable suspending functions

This sounds fine, but then it leads to another problem. What if that pop-up notification was inside a child window in your UI, and the user closes that window before the timeout? When the delay inside your coroutine expires, the function resumes execution even though it is not relevant anymore, and it tries to dismiss a pop-up that no longer exists. Surely, this won’t end well. Obviously it is always possible to check that the pop-up still exists before dismissing it, but that is inefficient and error-prone. Fortunately, there is a better way.

Suspending functions can (and should) be cancellable. That means that they can be cancelled from the outside and will never resume. Kotlin’s basic suspending functions, such as delay(), are cancellable, which means that any coroutine that is currently directly or indirectly suspended by delay() is itself cancellable. So in our example, you just need to cancel the coroutine when the window is closed, and it won’t try to dismiss a nonexistent pop-up later. Cancellation of coroutines, just like their suspension, is a cooperative mechanism : if your coroutine doesn’t use delay() but instead is busy processing a heavy computation, make sure that it periodically checks its cancellation status and behaves accordingly.

In order to cancel a coroutine, you have to keep a handle on it, in the form of a Job object that you get when you create the coroutine. Job objects provide, among other things, an isActive property, a join() method to wait for it to complete, and a cancel() method. Note that since cancellation is cooperative, the coroutine doesn’t necessarily end immediately (it should free its resources before exiting), so a cancelAndJoin() method can be used when necessary.
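The sketch below illustrates both situations, assuming kotlinx.coroutines is available. The Popup class and the two helper functions are hypothetical stand-ins, not part of any library : the first helper cancels a coroutine that is suspended in delay(), the second one cancels a busy loop that checks isActive on every iteration. The coroutines are created with launch(), which returns the Job handle.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.delay
import kotlinx.coroutines.isActive
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for the pop-up : it only remembers whether it was dismissed.
class Popup {
    @Volatile var dismissed = false
}

suspend fun dismissPopupAfter(popup: Popup, timeoutMillis: Long) {
    delay(timeoutMillis) // Cancellable : never resumes once the coroutine is cancelled
    popup.dismissed = true
}

// The "window" is closed after 100 ms, long before the 5 s timeout.
fun popupDismissedAfterEarlyClose(): Boolean = runBlocking {
    val popup = Popup()
    val job = launch { dismissPopupAfter(popup, 5_000) }
    delay(100)
    job.cancelAndJoin() // Cancels the coroutine and waits for it to finish
    popup.dismissed
}

// A computation that never calls delay() has to check its own status.
fun busyLoopStopsWhenCancelled(): Boolean = runBlocking {
    val job = launch(Dispatchers.Default) {
        var iterations = 0L
        while (isActive) iterations++ // Cooperative check on every iteration
        println("Stopped after $iterations iterations")
    }
    delay(50)
    job.cancelAndJoin()
    job.isCancelled
}

fun main() {
    println(popupDismissedAfterEarlyClose()) // prints false
    println(busyLoopStopsWhenCancelled()) // prints true
}
```

Without the isActive check, the loop in the second helper would never end and cancelAndJoin() would wait forever.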

See the Cancellation and Timeouts documentation for more information and examples.

Coroutine scopes and structured concurrency

However, keeping track of all the coroutines running in your program in order to cancel them when they are no longer needed can quickly become complicated and error-prone when you start having more than a handful of them. To help you, Kotlin implements a concept called structured concurrency by means of coroutine scopes.

A coroutine scope can be seen as a group of coroutines related to a given logical context in your application, such as a window, a service or a task. Every coroutine has to run inside a scope, and when the scope is cancelled, all the coroutines inside it are cancelled as well. Furthermore, coroutine scopes can be nested : in that case, when a top-level scope is cancelled, all the coroutines and coroutine scopes inside it are recursively cancelled. The library also provides a predefined scope called GlobalScope. It has no parent and is not tied to any job, so it cannot be cancelled as a whole : coroutines launched in it live as long as the application itself, and only stop when they complete, when they are cancelled individually, or when the application exits.

While you can launch any coroutine inside GlobalScope, it is usually a better idea to create new scopes as needed, and make them children and parents of one another in a way that is logical, safe and efficient for your application. For example, Android provides coroutine scopes with every Lifecycle object, such as Activities and ViewModels : use them to safely launch coroutines that won’t leak when your Activity or ViewModel is destroyed.
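As an illustration, here is a minimal sketch of a scope tied to a window, assuming kotlinx.coroutines is available. The Window class and its methods are hypothetical; only CoroutineScope(), launch(), cancel() and joinAll() come from the library.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.cancel
import kotlinx.coroutines.delay
import kotlinx.coroutines.joinAll
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical window that owns a scope for all of its background work.
class Window {
    private val scope = CoroutineScope(Job() + Dispatchers.Default)
    val jobs = mutableListOf<Job>()

    fun startTasks(count: Int) {
        // Each task would run for a minute if nobody cancelled it
        repeat(count) { jobs += scope.launch { delay(60_000) } }
    }

    // Cancelling the scope cancels every coroutine launched in it
    fun close() = scope.cancel()
}

fun closingWindowCancelsAllTasks(): Boolean = runBlocking {
    val window = Window()
    window.startTasks(3)
    window.close()
    window.jobs.joinAll() // Returns quickly since every task was cancelled
    window.jobs.all { it.isCancelled }
}

fun main() {
    println(closingWindowCancelsAllTasks()) // prints true
}
```

The window never has to cancel its tasks one by one : closing it cancels the scope, and the scope takes care of the rest.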

Coroutine context

Related to scopes are coroutine contexts. Simply speaking, a coroutine context is a set of miscellaneous elements that provide contextual information for a given coroutine scope, such as the dispatcher to use, or an optional name for the scope (useful when debugging).
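A context is built by combining its elements with the + operator, and each element can be read back using its type as a key. The sketch below assumes kotlinx.coroutines is available; the name "popup-timer" and the helper contextName are made up for illustration.

```kotlin
import kotlinx.coroutines.CoroutineName
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.runBlocking
import kotlin.coroutines.ContinuationInterceptor

// Hypothetical helper : creates a coroutine with a dispatcher and a name
// in its context, then reads the name back from inside the coroutine.
fun contextName(): String? =
    runBlocking(Dispatchers.Default + CoroutineName("popup-timer")) {
        coroutineContext[CoroutineName]?.name
    }

fun main() {
    println(contextName()) // prints popup-timer
    runBlocking(Dispatchers.Default + CoroutineName("popup-timer")) {
        // The dispatcher is stored under the ContinuationInterceptor key
        println(coroutineContext[ContinuationInterceptor])
        // The Job of the coroutine is also an element of its context
        println(coroutineContext[Job] != null)
    }
}
```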

Starting a coroutine

Since suspending functions can only be called from a coroutine (or from another suspending function), and every coroutine has to be launched inside a coroutine scope, we need to get a scope to get started. Of course, we have GlobalScope already provided by the library, so we could use GlobalScope.launch { ... } to launch our coroutine, but there are two problems with that :

  1. As we saw, it defeats the purpose of structured concurrency, so it is not the safest solution to use in a real application.
  2. By default, GlobalScope uses Dispatchers.Default, which runs your coroutine on a background thread automatically created by the dispatcher. So, starting coroutines this way is not much better than simply creating a thread in the first place.

The easiest and safest solution is to use runBlocking(). This function creates a new coroutine scope in the current thread, executes the block of code that you passed to it inside the context of this scope (so you can use suspending functions directly), and blocks until this scope completes (which means all the coroutines and coroutine scopes created inside it have also completed). It is, as the official documentation puts it, “bridging between blocking and non-blocking [coroutines] worlds”. We will see how to use this function shortly in the examples.
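As a quick preview, here is a minimal sketch of that behavior, assuming kotlinx.coroutines is available (the helper runBlockingOrder is hypothetical). It records the order in which things happen around a runBlocking() call that launches one child coroutine.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper : records what happens before, inside and after runBlocking().
fun runBlockingOrder(): List<String> {
    val log = mutableListOf<String>()
    log += "before runBlocking"
    runBlocking {
        launch {
            delay(100) // Suspending functions can be called directly in here
            log += "child coroutine done"
        }
        log += "end of block"
    }
    // Only reached once the block and the coroutines launched inside it have completed
    log += "after runBlocking"
    return log
}

fun main() {
    runBlockingOrder().forEach(::println)
    // prints :
    // before runBlocking
    // end of block
    // child coroutine done
    // after runBlocking
}
```

Even though the block itself ends almost immediately, runBlocking() does not return before the child coroutine has finished.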

So, with this, we now have every brick we need in order to start building interesting things with coroutines. Hopefully that wasn’t too much. Now that the theory is planted, let’s harvest the seeds with some illustrative examples.

6 thoughts on “Understanding Kotlin’s coroutines”

  1. Great article. I love that your examples also cover blocking calls like Thread.sleep and how specifying the dispatcher helps in this case. This would have helped me a lot when I first started working with coroutines!

  2. This is by far one of the BEST introductions to coroutines I’ve read. I’ve shared this with my team at work.
    Thank you for putting this together. The diagrams are amazing! I love how you build intuition, I’ve been wanting to put something like this together for some time but I don’t think I could have done such a good job.

    Small nitpick:
    In this sentence:
    >> “it would only require to change launch() for another coroutine builder : withContext()”

    Calling withContext() a coroutine *builder* can be misleading, as it doesn’t actually create a new coroutine, it just changes the scheduler (or context should I say). https://pl.kotl.in/T-XZv31xL

    >> “This function suspends the current coroutine while the code inside its block is executed.”

    Again, there’s only one coroutine 🙂 just in a different context. And the coroutine doesn’t get suspended when calling `withContext` (might suspend later tho)

    1. Hi Fernando, thanks for the comment and the suggestions !
      I checked the doc of withContext again and indeed, I had misunderstood its behavior. I think I was misled by the part that says “suspends until it completes”, which at first I thought meant that the current coroutine was suspended and implicitly that a new one was created. I fixed that paragraph and I hope it is now correct, but please let me know if you see some mistake.

  3. Dude you’re amazing, your prose is completely intuitive and easy to follow. NOW I can read that damned documentation and make sense of it.
