import threading
import multiprocessing
from multiprocessing import Queue
import os
import queue
import time
import redis
import rq

Introduction

Python is effectively single-threaded for CPU-bound work: the global interpreter lock (GIL) keeps more than one thread from executing Python bytecode at a time. Getting around this can be extraordinarily difficult, although several libraries strive to fix the problem. The biggest issue I've found with parallelization in Python is that there are dozens of libraries that all try to solve it, none of them are trivial to use, and it's hard to figure out which ones are best.

I'll go over each method that I've found to work in order of ease of use.

Poor Man's Parallelization

By far the easiest way to make code run in parallel is to launch a separate Python interpreter for every worker. This is generally done by having your main script kick off background Python processes with something like an os.system() call. The catch is that you need some mechanism to pass inputs and outputs between the processes. That mechanism can be a database, a text file, or some other kind of file, but it generally has to live on disk. Without an SSD, that means disk read/write speed can be a serious bottleneck.

In short: this is the easiest method to set up, but read/write times limit its speed.

Concurrency is also an issue. If your processes need to return values in a particular order, you can hit further bottlenecks from multiple processes trying to access the same file, and you'll need some way to determine the order in which values were written.

Despite all of these issues, this is often the quickest and easiest way to implement some parallelization.
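
A minimal sketch of the idea (worker.py and the result_*.txt files are hypothetical names I'm using for illustration, and subprocess is just a slightly tidier alternative to os.system()):

# worker.py (hypothetical) would contain something like:
#     import sys
#     x, y, out_path = int(sys.argv[1]), int(sys.argv[2]), sys.argv[3]
#     with open(out_path, 'w') as f:
#         f.write(str(x ** y))

import subprocess
import sys

procs = []
for i in range(8):
    out_path = 'result_{}.txt'.format(i)
    # each call launches a completely separate Python interpreter in the background
    procs.append(subprocess.Popen([sys.executable, 'worker.py', str(i), str(i**2), out_path]))

for p in procs:
    p.wait()  # block until every background interpreter has exited

results = [int(open('result_{}.txt'.format(i)).read()) for i in range(8)]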

The Threading Module

The threading module is another easy way to queue up jobs in different threads. Each call to threading.Thread() creates a new thread object that is set up to run the target function once you start it. To manage concurrency and collect return values, we'll also use a queue.Queue, since it is thread-safe and plays nicely with this kind of parallelism.

return_values = queue.Queue()  # thread-safe container for the results
threads = []
some_long_process = lambda x, y, q: q.put(x**y)  # stand-in for a real long-running job
for i in range(8, 0, -1):
    threads.append(threading.Thread(target=some_long_process, args=(i, i**2, return_values)))

Now that we've initialized our threads, we can run them. Afterwards we can access our queue and get the values back.

for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every thread to finish
[item for item in return_values.queue]

[6277101735386680763835789423207666416102355444464034512896,
 256923577521058878088611477224235621321607,
 10314424798490535546171949056,
 298023223876953125,
 4294967296,
 19683,
 16,
 1]

This is trivially easy: each thread puts its return value (or list of values) into the shared queue, which we read after waiting for all the threads to finish.
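
If you'd rather not manage the threads and the queue by hand, the standard library's concurrent.futures module wraps the same idea in a higher-level interface. A minimal sketch with the same toy workload, returning values directly instead of going through a queue:

from concurrent.futures import ThreadPoolExecutor

def some_long_process(x, y):
    return x ** y  # same toy workload as above

with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(some_long_process, i, i**2) for i in range(8, 0, -1)]
    results = [f.result() for f in futures]  # .result() blocks until that job finishes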

The Multiprocessing Module

Another option is the multiprocessing module. For basic use it works nearly the same as threading, but it spawns separate processes instead of threads, so each worker gets its own interpreter and the GIL no longer limits CPU-bound work.

return_values = Queue()  # must be multiprocessing's Queue, not queue.Queue

def some_long_process(x, y, q):
    # a def rather than a lambda, so the target can be pickled on spawn-based platforms
    q.put(x ** y)

jobs = []
for i in range(8):
    p = multiprocessing.Process(target=some_long_process, args=(i, i**2, return_values))
    jobs.append(p)
    p.start()

for i in range(8):
    print(return_values.get())  # .get() blocks until a result arrives
for p in jobs:
    p.join()  # clean up the worker processes

1
1
16
19683
4294967296
298023223876953125
10314424798490535546171949056
256923577521058878088611477224235621321607

Again, this is quite simple for most uses. One thing to note is that we must use multiprocessing's Queue instead of the standard library's queue.Queue: the worker processes don't share memory with the parent, so results have to come back through a queue that works across process boundaries.
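
For map-style workloads like this one you can also let multiprocessing manage the workers for you with a Pool. A minimal sketch (pow_job is my own illustrative name; it has to be defined at module level so it can be pickled):

def pow_job(x, y):
    return x ** y  # returns its result instead of writing to a queue

if __name__ == '__main__':
    with multiprocessing.Pool(processes=8) as pool:
        # starmap unpacks each argument tuple and returns the results in input order
        results = pool.starmap(pow_job, [(i, i**2) for i in range(8)])
    print(results)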

RQ

RQ (Redis Queue) is a library that uses a Redis server as a job queue for parallelization. The documentation states that

Make sure that the function's module is importable by the worker. In particular, this means that you cannot enqueue functions that are declared in the main module.

This means that any function you want to parallelize has to live in a separate, importable file.

Make sure you have Redis and python-rq installed, then start the Redis server. For each worker you want, start one with

rqworker

Each rqworker instance needs to be started in the directory you're working in so that it can find the modules you want to import.
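
The examples module imported below isn't shown here; assuming it simply mirrors the toy function we've been using, it might look like this:

# examples.py -- must be importable from the directory where rqworker was started
def some_long_importable_process(x, y):
    return x ** y  # returned (not printed) so RQ can store it as the job's result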

from examples import some_long_importable_process # same process that we've been working with

redis_conn = redis.Redis(host='localhost', port=6379)
q = rq.Queue(connection=redis_conn)  # named q so it doesn't shadow the stdlib queue module imported above

jobs = []
for i in range(8):
    j = q.enqueue_call(func=some_long_importable_process, args=(i, i**2))
    jobs.append(j)
time.sleep(0.5)  # give the workers a moment to finish
for i in range(8):
    print(jobs[i].result)  # None if the job hasn't finished yet

1
1
16
19683
4294967296
298023223876953125
10314424798490535546171949056
256923577521058878088611477224235621321607

RQ will nicely divvy up the jobs among however many workers you have running.
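
Rather than sleeping for a fixed amount of time, you can also poll the jobs until they report completion. A small sketch using the jobs list from above (it assumes none of the jobs fail):

while not all(j.is_finished for j in jobs):
    time.sleep(0.1)  # keep waiting until every job has finished
for j in jobs:
    print(j.result)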

Other methods

I've also played with IPython's cluster libraries and tools, but they are generally much more complex than necessary. Most parallel work can be done quite easily with the libraries above and doesn't really require that extra complexity.

Apache Spark is also a very nice tool, but it's more limited in its applications. See my other post for more information on it.

The last tool I've used is called Parallel Python. It works really well, but it only supports Python 2. I'm trying to get away from Python 2 as much as possible nowadays, so this library doesn't really work for me anymore. That said, if you're still on Python 2, I highly recommend checking it out.