Last Updated: February 25, 2016 · themichael'tips

Python: Threads for monitoring Multiprocessing

Introduction

Parallel programming can be done with multithreading, multiprocessing, or a hybrid approach. Sometimes it is useful to monitor and/or interact with a pool of worker processes from a separate thread. Here is an example.

Use Case

In a multiprocessing system, workers produce data that must be stored every n seconds.

We need a worker function (run in separate processes) and a save function. The workers can use a shared queue to store their data.

import time
import threading
import multiprocessing

def worker(args):
    # each worker receives a (data, queue) tuple from pool.map_async()
    data = args[0]
    queue = args[1]
    print('Worker with data %d' % data)
    # ...do stuff with your data...
    time.sleep(1.0)
    # push the result onto the shared queue without blocking
    queue.put_nowait(data)

def save():
    # 'queue' and 'workers' are the module-level objects created below;
    # the exercise at the end asks you to pass them in as arguments instead
    print('Saving...')
    # check for completion *before* draining the queue, so the final pass
    # cannot miss items that the workers add at the last moment
    done = workers.ready()
    with open("data.txt", "a") as f:
        while not queue.empty():
            f.write("%d\n" % queue.get())
    if not done:
        # not finished yet: reschedule this function on a new Timer thread
        threading.Timer(0.2, save).start()

Using a multiprocessing.Manager().Queue(), it is possible to store and share data between processes and threads.

A simple use of it:

# on Windows (spawn start method) this driver code must live under
# an "if __name__ == '__main__':" guard
manager = multiprocessing.Manager()
queue = manager.Queue()    # proxy queue, shareable between processes and threads
pool = multiprocessing.Pool(4)
# dispatch the workers asynchronously; keep the AsyncResult so save() can poll it
workers = pool.map_async(worker, [(data, queue) for data in range(10)])
# the first save() call runs in the main thread; later calls run on Timer threads
save()
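
The snippet above returns control right after the first save() call; the pending Timer threads keep the process alive until the last save has run. If you also want the main code to block explicitly until the pool is done, something along these lines could be appended (an addition of mine, not part of the original tip):

workers.wait()   # block until every worker has finished
pool.close()     # no more tasks will be submitted to the pool
pool.join()      # wait for the worker processes to exit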

In the save() function the trick is that it reschedules itself while the workers are still running. This is not recursion: each call returns immediately, and the next one is started later on a fresh threading.Timer thread, so the call stack never grows.
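
Here is the same self-rescheduling pattern in isolation (the tick() function and the 3-call limit are only illustrative):

import threading

def tick(count=0):
    print('tick %d' % count)
    if count < 3:
        # the current call returns right away; the next call runs later
        # on a brand new Timer thread, so no stack frames accumulate
        threading.Timer(0.2, tick, args=(count + 1,)).start()

tick()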

Exercise: try to modify the save() function signature so that it accepts the queue and workers variables as arguments.
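
One possible solution (just a sketch, there are other ways, e.g. functools.partial) passes both objects along through Timer's args parameter:

def save(queue, workers):
    print('Saving...')
    done = workers.ready()
    with open("data.txt", "a") as f:
        while not queue.empty():
            f.write("%d\n" % queue.get())
    if not done:
        # forward the same arguments to the next scheduled call
        threading.Timer(0.2, save, args=(queue, workers)).start()

# the first call now needs the arguments too
save(queue, workers)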