gmisclib.spread_jobs

A module that starts a bunch of subprocesses and distributes work amongst them, then collects the results.

Subprocesses must follow a protocol: they must listen for commands on the standard input (commands are encoded with cPickle), and they must produce cPickled tuples on their standard output. NOTE: THEY CANNOT PRINT ANYTHING ELSE! (But it's OK for subprocesses to produce stuff on the standard error output.)
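The worker side of this protocol can be sketched roughly as follows. This is a minimal illustration, not the module's own code: it uses Python 3's pickle in place of cPickle, and the function name serve, the handler argument, and the ("ok", value) / ("error", text) tuple layout are all assumptions made for the example, since the page does not say what the tuples must contain.

```python
import io
import pickle
import sys


def serve(handler, instream=None, outstream=None):
    """Read pickled commands until EOF; answer each with one pickled tuple.

    The reply layout ("ok", value) / ("error", text) is hypothetical.
    """
    instream = instream if instream is not None else sys.stdin.buffer
    outstream = outstream if outstream is not None else sys.stdout.buffer
    while True:
        try:
            command = pickle.load(instream)
        except EOFError:
            break  # the master closed our standard input
        try:
            reply = ("ok", handler(command))
        except Exception as exc:
            reply = ("error", repr(exc))
            # Diagnostics go to standard error only; standard output
            # must carry nothing but pickled tuples.
            print("worker failed:", exc, file=sys.stderr)
        pickle.dump(reply, outstream)
        outstream.flush()


# Self-check with in-memory streams standing in for the pipes.
fake_stdin = io.BytesIO(pickle.dumps(3) + pickle.dumps(4))
fake_stdout = io.BytesIO()
serve(lambda x: x * x, fake_stdin, fake_stdout)
fake_stdout.seek(0)
replies = [pickle.load(fake_stdout), pickle.load(fake_stdout)]
```

Flushing after every reply matters in a sketch like this: the master is waiting on the pipe, and a reply left in the worker's buffer would stall both sides.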

Imports: re, sys, math, time, random, CP, threading, subprocess, StringIO, die, gpkmisc, dictops, MB

main(todo, list_of_args, connection_factory=<class 'gmisclib.spread_jobs.Connection_subprocess'>, stdin=None, verbose=False, timing_callback=None, tail_callback=None, past_performance={})

Pass a bunch of work to other processes.

Parameters:

stdin (list(whatever)) - a list of items to send to the other processes before the computation proper begins.
todo (sequence(whatever)) - a sequence of tasks to do
list_of_args (list(list(str)))
past_performance (None or PastPerformance) - a PastPerformance object if you want the system to remember which machines were more or less successful last time and to start jobs on the more successful machines first; None if you don't want any memory. The default is to have memory.
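
The idea behind past_performance, starting jobs on the machines that did best last time, might look something like the sketch below. The class ToyPerformance and its methods record, score and order are hypothetical names invented for this illustration; the real PastPerformance interface is not shown on this page and may differ.

```python
class ToyPerformance:
    """Hypothetical stand-in: remembers successes and failures per machine."""

    def __init__(self):
        self._ok = {}
        self._tried = {}

    def record(self, machine, succeeded):
        self._tried[machine] = self._tried.get(machine, 0) + 1
        if succeeded:
            self._ok[machine] = self._ok.get(machine, 0) + 1

    def score(self, machine):
        # Smoothed success rate, so an untried machine scores 0.5
        # rather than being ranked below every machine that failed once.
        ok = self._ok.get(machine, 0)
        tried = self._tried.get(machine, 0)
        return (ok + 1) / (tried + 2)

    def order(self, machines):
        # Most successful machines first.
        return sorted(machines, key=self.score, reverse=True)


memory = ToyPerformance()
memory.record("a", True)
memory.record("a", True)
memory.record("b", False)
ranked = memory.order(["b", "c", "a"])
```

Here "a" (two successes) ranks first, the untried "c" second, and "b" (one failure) last.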

Returns: tuple(list(whatever), list(list(str)))

A 2-tuple. The first item is a list of the results produced by the other processes; items in this list correspond to items in the todo list. Each result is the object that comes out, pickled, on a process's standard output after a chunk of data is fed into its standard input. The second item is a list of the remaining outputs, one per process, as read by file.readlines().
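
To make the shape of the return value concrete, here is a toy master that mimics it. It is a sketch under several assumptions, not the module's implementation: toy_main and WORKER are hypothetical names, each entry of list_of_args is assumed to be a command line for one subprocess, tasks are handed out round-robin and one at a time rather than through worker threads, and Python 3's pickle replaces cPickle.

```python
import pickle
import subprocess
import sys

# A tiny worker: doubles each task, then leaves one trailing line of output.
WORKER = r"""
import pickle, sys
while True:
    try:
        task = pickle.load(sys.stdin.buffer)
    except EOFError:
        break
    pickle.dump((task * 2,), sys.stdout.buffer)
    sys.stdout.buffer.flush()
sys.stdout.buffer.write(b"done\n")
sys.stdout.buffer.flush()
"""


def toy_main(todo, list_of_args):
    """Return (results, tails), shaped like the 2-tuple described above."""
    procs = [
        subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        for args in list_of_args
    ]
    results = []
    for i, task in enumerate(todo):
        proc = procs[i % len(procs)]  # simple round-robin, no threading
        pickle.dump(task, proc.stdin)
        proc.stdin.flush()
        results.append(pickle.load(proc.stdout))
    tails = []
    for proc in procs:
        proc.stdin.close()  # EOF tells the worker to finish
        tails.append(proc.stdout.readlines())  # whatever output remains
        proc.stdout.close()
        proc.wait()
    return results, tails


results, tails = toy_main([1, 2, 3], [[sys.executable, "-c", WORKER]] * 2)
```

The results list has one entry per item of todo, while tails has one entry per process, which is why the two lists generally differ in length.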

Classes
	notComputed - A singleton marker for values that haven't been computed.
	NoResponse
	RemoteException - An exception that corresponds to one raised by a subprocess.
	TooBusy
	PastPerformance - This class keeps track of which machines are more and less successful.
	CannotCreateConnection
	Connection - This class represents a connection from the master process down to one of the slaves.
	Connection_subprocess - This is a Connection via stdin/stdout to a subprocess.
	workers_c - This creates a group of worker threads that take tasks from the iqueue and put the answers on the oqueue.
	unpickled_pseudofile - For testing.

Variables
	__package__ = 'gmisclib'
