Simple method to execute function calls in parallel in Python
Here is a simple skeleton to set up tasks to be run in parallel.
Parallelizing CPU-intensive work
One way is to leverage the ProcessPoolExecutor:
from concurrent.futures import ProcessPoolExecutor
from time import sleep

# We'll call an expensive function 10 times, using the following arguments.
ARGUMENT_LIST = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def expensive_function(delay):
    """
    Just call sleep for a given number of seconds to simulate an expensive function call.
    """
    print(f'About to process the {delay}-second delay!')
    sleep(delay)
    return delay

# The __main__ guard is required for ProcessPoolExecutor on platforms that
# spawn worker processes (e.g. Windows and recent macOS).
if __name__ == '__main__':
    with ProcessPoolExecutor() as executor:
        outputs = executor.map(expensive_function, ARGUMENT_LIST)
        for output in outputs:
            print(f'Getting back the output: {output}')
You’ll note that the ten print statements (f'About to process the {delay}-second delay!') appear almost immediately, and, provided the pool has enough worker processes (by default, one per CPU), the last output comes back after roughly 10 seconds, the duration of the longest call. Had the calls run sequentially, we would have waited one second before starting the second call to expensive_function, and so on, for a total execution time of ~1 + 2 + 3 + ... + 10, or roughly 55 seconds!
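Note that executor.map yields results in the order the arguments were submitted. If you would rather handle each result as soon as it finishes, concurrent.futures also provides submit() and as_completed(); here is a minimal sketch reusing expensive_function and ARGUMENT_LIST from above:

from concurrent.futures import ProcessPoolExecutor, as_completed

if __name__ == '__main__':
    with ProcessPoolExecutor() as executor:
        # submit() returns a Future per call; as_completed() yields each
        # Future as it finishes, regardless of submission order.
        futures = [executor.submit(expensive_function, delay) for delay in ARGUMENT_LIST]
        for future in as_completed(futures):
            print(f'Done with the {future.result()}-second delay')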
Parallelizing I/O-intensive work
For I/O-intensive work, leverage the ThreadPoolExecutor. You can use the same structure; just swap in the other executor:
from concurrent.futures import ThreadPoolExecutor

# Threads expose the same interface; no __main__ guard is needed here.
with ThreadPoolExecutor() as executor:
    results = executor.map(expensive_function, ARGUMENT_LIST)
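For a more concrete I/O-bound sketch, here is a hypothetical URL fetcher built on the standard library's urllib.request; the URLs are placeholders, so swap in your own. Each thread spends most of its time waiting on the network:

from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Hypothetical list of pages to fetch.
URLS = ['https://example.com', 'https://example.org', 'https://example.net']

def fetch(url):
    """Download a page and return its size in bytes."""
    with urlopen(url) as response:
        return url, len(response.read())

with ThreadPoolExecutor(max_workers=8) as executor:
    for url, size in executor.map(fetch, URLS):
        print(f'{url}: {size} bytes')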
Distinguishing CPU from I/O-intensive tasks
CPU-intensive tasks burn through CPU cycles (encryption/decryption, transcoding, compression, etc.), whereas I/O-intensive tasks spend most of their time waiting on the network or on disk reads/writes. The distinction matters because CPython's Global Interpreter Lock lets only one thread execute Python bytecode at a time: threads pay off when tasks are mostly waiting on I/O, while CPU-bound work needs separate processes to use multiple cores.
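To make the distinction concrete, here is a hedged example of each kind of task using only the standard library; the function names and the file path are illustrative, not from the original:

import hashlib
from pathlib import Path

def cpu_intensive(data: bytes) -> str:
    # CPU-bound: repeated hashing keeps the processor busy the whole time,
    # so a ProcessPoolExecutor is the right fit.
    for _ in range(100_000):
        data = hashlib.sha256(data).digest()
    return data.hex()

def io_intensive(path: str) -> int:
    # I/O-bound: the call spends most of its time waiting on the disk,
    # so a ThreadPoolExecutor is the right fit.
    return len(Path(path).read_bytes())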