Keeping your history

Logbooks are important in research but, realistically, they are completely useless if you do your work on a computer. Back in grad school I tried to keep one, but it was hopeless. One spends all day printing things out and pasting them in; it is closer to a primary school art class than to real research.

If you program, it’s even worse. Suppose you change a few lines in the middle of a 1000-line program. What do you print and glue into your logbook? Nothing you can put there will actually be helpful. So, use a logbook if you spend almost all your time in a real lab (one with beakers), but the rest of us will have to do something else.

But the idea behind logbooks cannot be ignored. In a complex project, it’s hard to remember what you did and when. Sometimes you will need to go back and check your work, or a reviewer of one of your papers will request more explanation, or whatever. Without good logs, you may need to re-compute your entire paper from scratch. [Without logbooks, just think how embarrassing true honesty would be. Suppose you work hard and analyze your heart out, and get a result that is statistically significant at the 0.01% level. You’d have to report that there is a 25.01% chance that someone who repeats your experiment will get different results: 0.01% from the statistics and 25% because those are the odds that you forgot to report some important part of the method.]

So, I do almost everything with scripts, which leaves me with a record of my computations. The scripts leave log files, and intermediate data files have headers to show where they come from. That catches the biggest parts of the computations, but little things still slip through the cracks.
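As a rough illustration of the "headers to show where they come from" idea, here is a minimal sketch of a shell function that prints a provenance header for a data file. The function name write_header, the field names, and the file names analyze.sh and raw.dat are all made up for this example; they are not taken from my actual scripts.

```shell
# Hypothetical helper: print a comment header recording where a data
# file came from.  $1 is the generating script; the rest are its inputs.
write_header() {
    echo "# generated_by: $1"
    shift
    echo "# inputs: $*"
    echo "# host: ${HOSTNAME:-unknown}"
    echo "# date: $(date +%Y-%m-%d)"
}

# Example: the header that a (made-up) analyze.sh would put at the top
# of its output when reading a (made-up) raw.dat.
write_header analyze.sh raw.dat
```

A script would typically redirect this into the top of its output file before writing the data, so the file carries its own history wherever it gets copied.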

Now, I’ve figured out how to log all the commands I type into a terminal on Linux to preserve an archive, in case I need to go back and figure out exactly what I was doing.

The standard bash shell already has a mechanism for this.  It’s called the .bash_history file.  All we need to do is use that mechanism to make a permanent archive.   To do it, just add these few lines to your .bashrc file in your home directory:

#This creates an archive of all shell commands typed:
test -d ~/history || mkdir ~/history
history_exit_trap() {
    dhist=$HOME/history/${HOSTNAME:-H}-$((${LINES:-40}/8)),$((${COLUMNS:-80}/12))-D${#PWD}-T$(date +%Y-%m).txt
    history -a "$dhist" && echo "####EOH####" "$(pwd)" "$(date +%Y-%m)" >>"$dhist"
}
trap "history_exit_trap" EXIT

The “test…” line creates a folder to store your archives. Then, history_exit_trap is a shell function that appends the commands you have executed to an archive file. It cooks up a filename that depends on the host name, the month, the (coarsely rounded) size of your terminal, and the length of your working directory’s path. So, especially if your window manager preserves your windows, and if you don’t stretch your windows too often, commands from a given project will tend to end up in the same archive file. Finally, the trap line arranges for the shell function to be called when your shell exits (i.e. when you log out and/or your terminal closes).
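To see how the pieces of the filename fit together, here is a small sketch that builds the same name from explicit arguments instead of from the live shell variables. The function name archive_name is invented for this illustration; the arithmetic is the same as in the dhist line.

```shell
# Hypothetical helper: build the archive filename from explicit inputs.
# $1=host  $2=terminal rows  $3=terminal columns  $4=directory  $5=YYYY-MM
archive_name() {
    # Rows and columns are divided by 8 and 12 so that small resizes
    # usually land in the same file; ${#4} is the length of the path.
    printf '%s-%d,%d-D%d-T%s.txt\n' \
        "$1" "$(($2 / 8))" "$(($3 / 12))" "${#4}" "$5"
}

archive_name mace 40 90 /home/gpk/history 2010-01
# prints: mace-5,7-D17-T2010-01.txt
```

Note that two different directories whose paths happen to have the same length will share a file; the EOH line inside the file records the real directory.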

Do this, and your ~/history directory will fill up with files named like mace-5,7-D17-T2010-01.txt .  Internally, each one will look like this:

cd history/
####EOH#### /home/gpk/history 2010-01

That is, a bunch of commands followed by an EOH line giving your working directory and the date.
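Because each session ends with an EOH line naming the directory, it is easy to pull out the commands that belong to one project later. The following is only a sketch: the function name sessions_in_dir is invented here, and it assumes directory names without spaces and an archive without timestamp lines (that is, HISTTIMEFORMAT is not set).

```shell
# Hypothetical helper: read archive text on stdin and print the commands
# of every session whose EOH line names directory $1.
sessions_in_dir() {
    awk -v dir="$1" '
        $1 == "####EOH####" {
            # $2 is the working directory recorded when the shell exited
            if ($2 == dir) printf "%s", buf
            buf = ""
            next
        }
        { buf = buf $0 "\n" }   # collect commands until the next EOH line
    '
}

# Demo on two made-up sessions; only the first ended in /home/gpk/history.
printf '%s\n' \
    'cd history/' \
    '####EOH#### /home/gpk/history 2010-01' \
    'ls' \
    '####EOH#### /tmp 2010-01' | sessions_in_dir /home/gpk/history
# prints: cd history/
```

On a real archive you would feed it the files themselves, for example with cat ~/history/*.txt | sessions_in_dir "$PWD".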

(I thank Yaroslav Halchenko and the Bash Hacker’s Wiki for inspiration.)

Note added Jan 15 2010:

The above scheme works nicely, except that it seems to turn off the normal .bash_history file.    My new scheme is to delete those lines from .bashrc and add the following stuff to .bash_logout.  (That’s a better place for it, anyway.)

test -d ~/history || mkdir ~/history
dhist=$HOME/history/${HOSTNAME:-H}-$((${LINES:-40}/8)),$((${COLUMNS:-80}/12))-D${#PWD}-T$(date +%Y-%m).txt
history -a "$dhist" && echo "####EOH####" "$(pwd)" "$(date +%Y-%m)" >>"$dhist"
if test $(($$ % 10)) = 0
then
    history -n "$HISTFILE" && history -w "$HISTFILE"
else
    history -a "$HISTFILE"
fi

The first three lines create the $HOME/history directory and append to the archive file in it. The remaining lines write to your normal .bash_history file.
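The test on $$ (the shell’s process ID) picks out roughly one shell in ten. As I read it, that occasional branch re-reads the history file and rewrites it whole, while the usual branch just appends. The sketch below uses a made-up function, logout_action, to show which branch a given process ID would take. It also shows why the spaces around the = matter: without them, test sees a single non-empty word and always succeeds.

```shell
# Hypothetical helper: report which branch a shell with process ID $1
# would take at logout.
logout_action() {
    if test $(($1 % 10)) = 0
    then
        echo rewrite    # PID divisible by 10: history -n, then history -w
    else
        echo append     # everything else: history -a
    fi
}

logout_action 4210   # prints: rewrite
logout_action 4217   # prints: append

# Pitfall: with no spaces, "7=0" is one non-empty string, so test is true.
if test 7=0; then echo "always true"; fi   # prints: always true
```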