Multi-core shell script

> This can allow faster training (because you can have one batch being trained on each core of a multi-core processor, at the same time), but it also reduces the rounding errors.   In this case I didn’t do parallel training because I was too [self-denigration] to figure out how to wait until all 4 cores were finished.

Good.   I like it when the postdocs get clever.    But one can get even a little more clever and actually use several cores.  The easy way to use 4 cores at once in a shell script is this:

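i=0
for x in whatever
do
 nice your_program "$x" &        # start the job in the background
 i=$((i+1))                      # one more job in this batch
 if test $i -ge 4; then wait; i=0; fi   # batch full: wait for all, then reset
done
wait                             # catch a final, partly filled batch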

Running a program with an ampersand (&) gets it started in the background.    Run it with nice so that it doesn’t interfere too much with your interactive session.   Linux will spread the available jobs across all four cores.

The variable $i counts the jobs started in the current batch, and so tracks how many cores are busy.  Once all 4 are busy, the script calls “wait”, which simply blocks until all your background jobs are done, and then resets the count.  The only tricky bit is the final “wait” at the end, which handles the case where the last batch is short and you might have only 2 of 4 processors loaded up.
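One thing a bare “wait” does not tell you is whether any job failed: with no arguments it returns 0 regardless. Here is a minimal sketch of one batch that remembers each job's PID (from $!) and waits on them one at a time, so that failures are noticed. The name run_batch and its calling convention are made up for this example.

```shell
# run_batch is a hypothetical helper name for this sketch.
# Usage: run_batch COMMAND ITEM...
# Starts COMMAND once per ITEM in the background, then waits for each job
# by PID and returns 1 if any of them exited with a nonzero status.
run_batch() {
    cmd=$1
    shift
    pids=""
    for x in "$@"; do
        nice "$cmd" "$x" &
        pids="$pids $!"          # $! is the PID of the most recent background job
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1  # "wait PID" returns that job's exit status
    done
    return "$status"
}
```

You would call it once per group of four, e.g. `run_batch your_program a b c d || echo "a job failed"`.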

Note that this isn’t really optimized, because it waits until all four cores are free before starting the next set of jobs.  Since some processes will finish early, there will be some slack time while waiting for the last process to finish.  But as long as your jobs take roughly the same amount of time to complete, this simple approach will work pretty well.
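If the slack time bothers you and you want to stay in the shell, one alternative (a different technique from the batch loop above) is to let xargs manage the pool. This sketch assumes an xargs that supports the -P option, which GNU and BSD versions do but POSIX does not require. The name run_pool is made up for this example.

```shell
# run_pool is a hypothetical helper name for this sketch.
# Usage: run_pool MAX_JOBS COMMAND ITEM...
# Keeps up to MAX_JOBS copies of COMMAND running, one ITEM per invocation,
# and starts the next item as soon as any running job exits.
run_pool() {
    max_jobs=$1
    cmd=$2
    shift 2
    # One item per line; xargs splits on whitespace, so items must not
    # contain spaces or quotes in this simple form.
    printf '%s\n' "$@" | xargs -n 1 -P "$max_jobs" nice "$cmd"
}
```

For example, `run_pool 4 your_program a b c d e f` keeps four jobs going until the list runs out, instead of idling while the slowest job of each batch finishes.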

You don’t want to run more processes than cores.   They run slower because each process messes up the processor’s caches for the others.  The exception is that if your process does a lot of I/O, it may make sense to change that to -ge 5, so that the 5th process can be computing while one of the other four waits for some I/O to complete.
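Rather than hard-coding 4, you can ask the system how many cores it has. This is a rough sketch, assuming either nproc (GNU coreutils) or getconf _NPROCESSORS_ONLN is available; the helper name core_count and the fallback value of 4 are my own choices for illustration.

```shell
# core_count is a hypothetical helper name for this sketch.
# Prints the number of online cores, falling back to 4 (the count assumed
# in this post) if neither tool is available.
core_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif n=$(getconf _NPROCESSORS_ONLN 2>/dev/null); then
        echo "$n"
    else
        echo 4
    fi
}

max_jobs=$(core_count)
# For I/O-heavy jobs, allow one extra process, as suggested above.
io_heavy_jobs=$((max_jobs + 1))
echo "compute-bound: $max_jobs jobs, I/O-bound: $io_heavy_jobs jobs"
```

The loop's test then becomes `test $i -ge "$max_jobs"` (or `"$io_heavy_jobs"` for I/O-bound work).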

If the processes are wildly variable in run time (i.e. they vary by more than a factor of 2 or 3) then you need to get a bit more sophisticated.  I’d probably write a little Python script and use the spread_jobs module in gmisclib, from