Simple method to execute function calls in parallel in Python
Here is a simple skeleton to set up tasks to be run in parallel.
Parallelizing CPU-intensive work
One way is to leverage the ProcessPoolExecutor:
from concurrent.futures import ProcessPoolExecutor
from time import sleep

# We'll call an expensive function 10 times, using the following arguments.
ARGUMENT_LIST = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def expensive_function(delay):
    """
    Just call sleep for a given number of seconds to simulate an expensive function call.
    """
    print(f'About to process the {delay}-second delay!')
    sleep(delay)
    return delay

# The __main__ guard is required for ProcessPoolExecutor on platforms that
# spawn worker processes (e.g. Windows and recent macOS).
if __name__ == '__main__':
    with ProcessPoolExecutor() as executor:
        outputs = executor.map(expensive_function, ARGUMENT_LIST)
        for output in outputs:
            print(f'Getting back the output: {output}')
You’ll note that the ten print statements (f'About to process the {delay}-second delay!') appear almost immediately, and, provided the pool has enough worker processes (by default, one per CPU), the last output comes back after roughly 10 seconds, the duration of the longest call. Had the calls run sequentially, we would have waited one second before starting the second call to expensive_function, and so on, for a total execution time of ~1 + 2 + 3 + ... + 10, or roughly 55 seconds!
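Note that executor.map yields results in the order the arguments were submitted. If you would rather handle each result as soon as it finishes, concurrent.futures also provides submit() and as_completed(); here is a minimal sketch reusing expensive_function and ARGUMENT_LIST from above:

from concurrent.futures import ProcessPoolExecutor, as_completed

if __name__ == '__main__':
    with ProcessPoolExecutor() as executor:
        # submit() returns a Future per call; as_completed() yields each
        # Future as it finishes, regardless of submission order.
        futures = [executor.submit(expensive_function, delay) for delay in ARGUMENT_LIST]
        for future in as_completed(futures):
            print(f'Done with the {future.result()}-second delay')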
Parallelizing I/O-intensive work
For I/O-intensive work, leverage the ThreadPoolExecutor. You can use the same structure; just swap in the other executor:
from concurrent.futures import ThreadPoolExecutor

# Threads expose the same interface; no __main__ guard is needed here.
with ThreadPoolExecutor() as executor:
    results = executor.map(expensive_function, ARGUMENT_LIST)
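For a more concrete I/O-bound sketch, here is a hypothetical URL fetcher built on the standard library's urllib.request; the URLs are placeholders, so swap in your own. Each thread spends most of its time waiting on the network:

from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Hypothetical list of pages to fetch.
URLS = ['https://example.com', 'https://example.org', 'https://example.net']

def fetch(url):
    """Download a page and return its size in bytes."""
    with urlopen(url) as response:
        return url, len(response.read())

with ThreadPoolExecutor(max_workers=8) as executor:
    for url, size in executor.map(fetch, URLS):
        print(f'{url}: {size} bytes')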
Distinguishing CPU from I/O-intensive tasks
CPU-intensive tasks burn through CPU cycles (encryption/decryption, transcoding, compression, etc.), whereas I/O-intensive tasks spend most of their time waiting on the network or on disk reads/writes. The distinction matters because CPython's Global Interpreter Lock lets only one thread execute Python bytecode at a time: threads pay off when tasks are mostly waiting on I/O, while CPU-bound work needs separate processes to use multiple cores.
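To make the distinction concrete, here is a hedged example of each kind of task using only the standard library; the function names and the file path are illustrative, not from the original:

import hashlib
from pathlib import Path

def cpu_intensive(data: bytes) -> str:
    # CPU-bound: repeated hashing keeps the processor busy the whole time,
    # so a ProcessPoolExecutor is the right fit.
    for _ in range(100_000):
        data = hashlib.sha256(data).digest()
    return data.hex()

def io_intensive(path: str) -> int:
    # I/O-bound: the call spends most of its time waiting on the disk,
    # so a ThreadPoolExecutor is the right fit.
    return len(Path(path).read_bytes())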