Asynchronous Python [Part 1]: Mastering GIL, Multithreading, and Multiprocessing

10 min read
Introduction - Stepping into Asynchronous Python

Asynchronous programming introduces a paradigm shift by allowing actions to occur independently of one another, rather than waiting for one activity to complete before moving on to the next. The ability to conduct operations concurrently can dramatically increase the speed of our programs, whether we're handling multiple network connections, processing large volumes of data, or building responsive user interfaces. In this article, we will unpack the beauty of asynchronous programming in Python, demystifying difficult concepts while keeping it delightfully simple.

Key Terms - Unpacking the Python Asynchronous Glossary

Before we go into the details, let's define some key technical terms:

  • Synchronous (sync): actions occur sequentially, one after another. This aligns with the traditional programming approach where each line of code executes in the order it's written.
  • Asynchronous (async): in contrast to synchronous actions, asynchronous actions defy a fixed chronological order. They can occur independently or in an arbitrary sequence, introducing a dynamic and responsive programming paradigm.
  • Concurrency: the ability of programs to execute tasks in varying orders every time the program runs without affecting the outcome.
  • Parallelism: simultaneous execution of two or more tasks, genuinely occurring in parallel utilizing multiple resources at once.
  • Multitasking: executing tasks one at a time, rapidly switching between them to create the illusion of simultaneous execution.
    • Preemptive Multitasking:
      • Definition: Employed by Linux and most modern operating systems, preemptive multitasking gives the kernel complete control over how long each process is allowed to run.
      • Mechanism: The kernel can preempt a process if it runs for an extended period, ensuring fair allocation of CPU resources among various tasks.
    • Cooperative Multitasking:
      • Definition: In contrast, cooperative multitasking relies on processes voluntarily yielding control to the operating system.
      • Mechanism: A process retains control until it willingly passes it to the operating system, which may lead to uneven distribution of resources if a process does not cooperate.
  • Thread: the fundamental unit of code execution, capable of running independently on one of the computer's cores.
  • Process: Comprises one or more threads and includes necessary resources (shared among threads) like network connections, input devices, or specific cores for thread execution.
  • GIL (Global Interpreter Lock): A crucial resource (mutex) in any Python program that is unique to each Python process. It ensures that only one thread at a time runs Python bytecode, influencing how Python manages concurrent threads.
  • CPU Bound Tasks: operations that primarily rely on the computational power of the CPU. These tasks involve complex calculations, data processing, and mathematical operations where the CPU is the bottleneck.
  • I/O Bound Tasks: operations that spend a significant amount of time waiting for input/output operations to complete. These tasks include file operations, network requests, and database queries where the CPU is often idle while waiting for external resources.
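The cooperative-multitasking entry above can be sketched with plain generators: each task runs until it voluntarily yields, and a naive round-robin scheduler decides who runs next. This is illustrative code, not from asyncio itself, but it captures the core idea that asyncio's event loop builds on:

```python
from collections import deque

log = []


def task(name, steps):
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # voluntarily hand control back to the scheduler


def run(tasks):
    # Naive round-robin scheduler: resume each task until it yields again,
    # and drop it once it finishes
    queue = deque(tasks)
    while queue:
        t = queue.popleft()
        try:
            next(t)
            queue.append(t)
        except StopIteration:
            pass


run([task("A", 2), task("B", 2)])
print(log)  # ['A0', 'B0', 'A1', 'B1']
```

Note how the two tasks interleave only at the points where they yield; if a task never yielded, it would monopolize the scheduler, which is exactly the "uncooperative process" failure mode described above.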

The GIL - Decoding the Global Interpreter Lock

What is the GIL (Global Interpreter Lock) and why does it exist in Python?

The GIL, or Global Interpreter Lock, is a mechanism in CPython (the reference Python interpreter) that ensures only one thread executes Python bytecode at a time, protecting the interpreter's internal state from race conditions.

This lock is necessary to manage dynamic memory in CPython, as its memory management is not thread-safe. The GIL prevents multiple threads from concurrently running Python code to avoid potential issues like race conditions and deadlocks related to memory management.
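The memory management in question is reference counting: every object carries a counter of how many references point to it, and the object is freed when that counter drops to zero. A quick illustrative snippet (assuming CPython, since `sys.getrefcount` is implementation-specific):

```python
import sys

a = []
b = a  # a second reference to the same list object

# getrefcount reports at least the two names above; the exact number
# also includes the temporary reference created by the call itself
print(sys.getrefcount(a))
```

Every assignment and deletion updates this counter. Without the GIL, two threads modifying it at the same time could corrupt it, freeing memory that is still in use or leaking memory that never gets freed.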

What is the tradeoff of using GIL?

The GIL is a compromise between allowing multi-threaded code and ensuring the correctness of CPython's dynamic memory management. Without the GIL, multiple threads could interfere with each other, leading to data corruption and other memory-related issues. The lock provides a balance between thread safety and performance, preventing the complexities associated with fully thread-safe memory management.

Can the GIL be bypassed in Python?

To bypass the GIL in CPython, some Python libraries use C extensions. However, bypassing the GIL requires careful consideration, and not all operations can be easily moved outside its constraints. Fortunately, many operations that might potentially be blocked or are long-running, such as I/O operations, image processing, and NumPy number crunching, inherently occur outside the GIL.
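A quick way to observe the GIL being released around blocking operations is `time.sleep`, which drops the lock while it waits. This is a minimal sketch, not from the article's benchmark code:

```python
import time
from threading import Thread


def blocking_call():
    time.sleep(1)  # releases the GIL while sleeping


threads = [Thread(target=blocking_call) for _ in range(4)]

start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

# The four one-second waits overlap instead of running back to back,
# so the total is close to 1 second rather than 4
print(f"{elapsed:.2f} seconds")
```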

Are there alternatives to the GIL in other Python implementations?

Yes, alternative Python implementations like Jython and IronPython, built on Java and .NET platforms respectively, do not have a GIL. These implementations leverage different approaches to dynamic memory management, allowing Python code to run in multiple threads simultaneously without the need for a Global Interpreter Lock.

What is a recommended solution for overcoming the GIL's limitations?

For CPython users facing limitations imposed by the GIL, the recommended solution is to leverage the multiprocessing module instead of multithreading. Unlike threads, multiprocessing creates separate processes, each with its own GIL. This allows multiple processes to run Python code concurrently, effectively utilizing multiple processor cores and providing a more efficient solution for parallel execution.

Why hasn’t the GIL been removed from Python yet?

The GIL remains in Python due to the challenges associated with finding a suitable replacement that aligns with Python's design principles, effectively addresses concurrency issues, and maintains or improves the speed of single-threaded and multithreaded programs. The requirement to optimize for common use cases, coupled with the complexity of addressing concurrency issues, underscores the cautious approach toward removing the GIL.

Note: PEP 703 (Making the Global Interpreter Lock Optional in CPython) signals a potential future direction for Python, where the GIL may become optional, enhancing the language's ability to handle concurrent tasks more efficiently.

Unraveling Multithreading - Concurrency in Python

As previously stated, the GIL ensures that only one thread can run Python bytecode at a time. This means that Python threads cannot achieve true parallelism across multiple CPU cores. However, Python does allow for cooperative multitasking among threads. Let's look at how multithreading works for CPU-bound and I/O-bound operations.

Let’s create a module called tasks.py to create simple CPU and I/O-bound tasks:

```python
import time

import requests


def cpu_bound_task():
    result = 0
    for i in range(20000000):
        result += i**2
    return result


def io_bound_task():
    response = requests.get("https://www.example.com")
    return len(response.text)


def run_cpu_bound_task():
    print("Starting CPU bound Task ...")
    start_time = time.time()
    cpu_bound_task()
    print(f"CPU bound Task Time taken: {time.time() - start_time} seconds")


def run_io_bound_task():
    print("Starting I/O bound Task ...")
    start_time = time.time()
    io_bound_task()
    print(f"I/O bound Task Time taken: {time.time() - start_time} seconds")
```

The GIL significantly impacts the parallel execution of CPU-bound tasks in Python. As only one thread can execute Python bytecode at a time due to the GIL, the performance gains typically associated with parallel processing on multiple CPU cores are limited.

```python
import time
from threading import Thread

from tasks import run_cpu_bound_task

if __name__ == "__main__":
    # With a single thread, run two CPU-bound tasks
    start_time = time.time()
    run_cpu_bound_task()
    run_cpu_bound_task()
    print(f"Single thread total time: {time.time() - start_time} seconds \n\n")

    # Using threading to run two CPU-bound tasks
    thread1 = Thread(target=run_cpu_bound_task)
    thread2 = Thread(target=run_cpu_bound_task)

    start_time = time.time()

    # Start both threads
    thread1.start()
    thread2.start()

    # Wait for both threads to finish
    thread1.join()
    thread2.join()

    print(f"Multi thread total time: {time.time() - start_time} seconds")
```
```bash
Starting CPU bound Task ...
CPU bound Task Time taken: 1.9319941997528076 seconds
Starting CPU bound Task ...
CPU bound Task Time taken: 1.989990472793579 seconds
Single thread total time: 3.9239988327026367 seconds


Starting CPU bound Task ...
Starting CPU bound Task ...
CPU bound Task Time taken: 3.811995267868042 seconds
CPU bound Task Time taken: 3.8019750118255615 seconds
Multi thread total time: 3.834996223449707 seconds
```

Here, the GIL prevents threads from utilizing multiple processor cores, affecting the overall efficiency of parallel execution.

On the other hand, I/O-bound tasks are far less affected by the GIL. While a thread is waiting for an I/O operation to complete, it releases the GIL, allowing other threads to execute Python bytecode; this is cooperative multitasking in action.

```python
import time
from threading import Thread

from tasks import run_cpu_bound_task, run_io_bound_task

if __name__ == "__main__":
    # With a single thread, run the CPU-bound task and the I/O-bound task
    start_time = time.time()
    run_cpu_bound_task()
    run_io_bound_task()
    print(f"Single thread total time: {time.time() - start_time} seconds \n\n")

    # Run the CPU-bound task and the I/O-bound task in two threads
    thread_io = Thread(target=run_io_bound_task)
    thread_cpu = Thread(target=run_cpu_bound_task)

    start_time = time.time()

    # Start both threads
    thread_io.start()
    thread_cpu.start()

    # Wait for both threads to finish
    thread_io.join()
    thread_cpu.join()

    # Print results
    print(f"Multi thread total time: {time.time() - start_time} seconds")
```
```bash
Starting CPU bound Task ...
CPU bound Task Time taken: 1.965041160583496 seconds
Starting I/O bound Task ...
I/O bound Task Time taken: 1.65578031539917 seconds
Single thread total time: 3.6228082180023193 seconds


Starting I/O bound Task ...
Starting CPU bound Task ...
I/O bound Task Time taken: 1.8610336780548096 seconds
CPU bound Task Time taken: 1.9550292491912842 seconds
Multi thread total time: 1.9580087661743164 seconds
```

Here, the GIL has a minimal impact on performance: the CPU-bound thread keeps running while the I/O-bound thread waits, so the total time is close to that of the CPU-bound task alone.

Multiprocessing Unlocked - Power of Multiple Processors

Python's multiprocessing module provides support for creating multiple processes, allowing for parallel execution of tasks and taking advantage of multi-core processors. Unlike threading, multiprocessing uses separate processes with their own memory space, which can lead to better performance for CPU-bound tasks.

```python
import time
from multiprocessing import Process

from tasks import run_cpu_bound_task

if __name__ == "__main__":
    # Using multiprocessing to run two CPU-bound tasks
    process1 = Process(target=run_cpu_bound_task)
    process2 = Process(target=run_cpu_bound_task)

    start_time = time.time()

    # Start both processes
    process1.start()
    process2.start()

    # Wait for both processes to finish
    process1.join()
    process2.join()

    end_time = time.time()
    print(f"Total time taken: {end_time - start_time} seconds")
```
```bash
Starting CPU bound Task ...
Starting CPU bound Task ...
CPU bound Task Time taken: 2.020033597946167 seconds
CPU bound Task Time taken: 2.077033758163452 seconds
Total time taken: 2.4200003147125244 seconds
```

By utilizing multiprocessing, we can potentially observe performance improvement for CPU-bound tasks as they can run concurrently in separate processes. This takes advantage of multiple CPU cores, reducing the overall execution time.

While multiprocessing can offer performance benefits, it comes with some limitations. One significant challenge is Inter-Process Communication (IPC). Since each process has its own memory space, sharing data between processes requires communication mechanisms like queues, pipes, or shared memory.

In scenarios where frequent communication is necessary, the overhead of IPC can offset the advantages gained from parallel processing. Therefore, it's essential to consider the nature of the task and the communication requirements when deciding between multithreading and multiprocessing.

Multithreading vs. Multiprocessing - Optimizing Choices in Python

CPU-Bound Tasks: For tasks that primarily involve heavy computations without the need for frequent communication between processes, multiprocessing is often the better choice. It enables parallel processing, harnessing the power of multiple cores to deliver improved performance.

I/O-Bound Tasks: In scenarios where the bottleneck is I/O operations and parallelism is sought for the efficient handling of numerous tasks concurrently, multithreading is a suitable option. The Global Interpreter Lock becomes less of a concern in I/O-bound situations.
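In practice, both choices can be expressed through the same `concurrent.futures` interface, so switching between `ThreadPoolExecutor` and `ProcessPoolExecutor` is a one-line change. A minimal sketch, with `time.sleep` standing in for a blocking I/O call:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    time.sleep(delay)  # stand-in for a blocking network or disk call
    return delay


start = time.time()
# Three 0.5-second waits run concurrently in the thread pool
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fake_io, [0.5, 0.5, 0.5]))
elapsed = time.time() - start

print(results)  # [0.5, 0.5, 0.5]
# elapsed is close to 0.5 seconds, not 1.5
```

For CPU-bound work, replacing `ThreadPoolExecutor` with `ProcessPoolExecutor` (and guarding the entry point with `if __name__ == "__main__":`) gives the multiprocessing behavior shown earlier without restructuring the code.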
