DEV Community

Cover image for Goroutines: Solving the problem of efficient multithreading
parthlaw
parthlaw

Posted on

Goroutines: Solving the problem of efficient multithreading

Introduction

Goroutines are the feature provided by Golang to make concurrent programming simpler and more efficient, eliminating the need to manually manage the threads.
They have a simple syntax:

go functionToRun()
Enter fullscreen mode Exit fullscreen mode

Although it may seem like a traditional multithreading, it is much more efficient than that.

Concept of G, M and P

The document explaining the Goroutines explains the concept of G, M and P orchestration to manage Goroutines efficiently.
G: It refers to Goroutines.
M: It refers to worker threads which executed G i.e. Goroutines.
P: It refers to the processor. It represents a resource which is required to execute Golang code.
To execute a Go code, M must have a P.
The whole process has three main components:

  • Global Run Queue: Queue of Goroutines to be executed. This queue is global and accessible to all M's and P's.
  • Local Run Queue: This is also a queue of Goroutines ready for execution. However, this queue is local i.e. per P.
  • Local Scheduler: The scheduler schedules which G to execute and when. The scheduler is local i.e. per P. The idea behind keeping the scheduler local is to avoid bottlenecks in speed due to the scheduler (if all P's depend on one scheduler).
  • P-array: Array of all P's available. The size of the array is equal to GOMAXPROCS env variable.

What happens when we submit a Goroutine?

So what happens when we submit a Goroutine? There are two sides to the whole flow. Let's break it down step by step:

Goroutine Side

  1. We submit a Goroutine using go keyword.
  2. The worker thread in which Goroutine is submitted (it can be a main or any other thread) i.e. M has an associated P and a scheduler (comes with P).
  3. Two paths submitted Goroutine i.e. G can take:
    • Add it to the local run queue so that the current M picks it up and executes it using P.
    • Add it to the global run queue so that any other M picks it up and executes it using the corresponding P.
  4. The decision to submit it to the appropriate queue is taken by the scheduler based on various factors:
    • Availability of space in the local run queue.
    • Availability of current M i.e. current M is not blocked by any operation such as waiting for I/O. If it is blocked then G is added to the global run queue.
    • Priority considerations: The scheduler is preemptive priority-based. If the submitted Goroutine has lower priority than others then it is added to the global run queue.

Diagrammatic illustration from a single M POV

Image description

Worker thread Side

  1. When an M is out of work i.e. it has no G waiting to be executed in the local run queue, it immediately starts looking for a G to execute from the global run queue.
  2. In case it does not find any G from the global run queue, it immediately goes into the spinning state.
    • Spinning State: It is the state in which the M (worker thread) is constantly looking for work in other P's run queues.
  3. When a thread in a spinning state finds work, it unparks an additional worker-thread to make it in a spinning state. This makes sure that there is at least one spinning M available.
    • Parking: An M (or worker thread) is said to be in a parking state when it is put into a suspended or waiting state to conserve resources. In this state, the execution of M is paused until it is unparked.
    • Unparking: When the execution of a parked M is resumed, it is said to be in the unparked state.
    • Unparking of an M occurs in two scenarios when a G is submitted: i). When there are no spinning M. ii). There is an idle P.
  4. The new spinning M starts looking for work.
  5. If an M does not have an associated P and it is willing to execute some Go code then it pops a P out from the P-array.
  6. When a P is done with executing G, it pops another G from its own list of runnable G's. If the list is empty, it chooses a random P and steals work from it. This is work-stealing algorithm.

The concept of spinning worker thread ensures speed and maximum CPU utilization.

Minimum number of spinning is 1 (as ensured by the algorithm) and the maximum can be GOMAXPROCS env variable.

I tried to represent the flow visually.

Image description

Conclusion

The explained G, M and P orchestration is explained in detail in the Design Doc provided in the code explination. You can go through to get a detailed description and more understanding of the algorithm.
I tried to explain the concept in brief. Please point out if there is any inaccuracy in the content.🙌

Top comments (0)