log is a simple Python script which takes a snapshot
of your system before and after you run a command.
To use it, prepend 'log' to the command. For example:
> samtools index data.bam
becomes:
> log samtools index data.bam
In doing so, log will record the following details about the execution event:
- the command & parameters
- execution start time
- username
- user permissions (e.g. if run as root)
- hostname
- execution duration
- the output
- an ID unique to this execution event
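As a rough illustration of the details listed above, here is a minimal sketch of how such an execution record could be collected in Python. It is not log's actual code: the function names run_and_record and current_user and the dictionary keys are hypothetical, chosen only for this example.

```python
import getpass
import os
import socket
import subprocess
import time
import uuid


def current_user():
    """Best-effort username lookup; falls back to 'unknown' if none is available."""
    try:
        return getpass.getuser()
    except Exception:
        return "unknown"


def run_and_record(argv):
    """Run argv (a list of strings) and return a dict describing the execution event.

    Hypothetical helper for illustration only; the field names are assumptions.
    """
    start_time = time.time()        # wall-clock start, for the record
    started = time.monotonic()      # monotonic clock, for the duration
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # capture stdout and stderr together
        text=True,
    )
    return {
        "id": uuid.uuid4().hex,     # ID unique to this execution event
        "command": list(argv),
        "start_time": start_time,
        "username": current_user(),
        # geteuid is POSIX-only, so treat its absence as "not root"
        "is_root": hasattr(os, "geteuid") and os.geteuid() == 0,
        "hostname": socket.gethostname(),
        "duration": time.monotonic() - started,
        "output": proc.stdout,
        "exit_code": proc.returncode,
    }
```

Calling run_and_record(["samtools", "index", "data.bam"]) would then return one such record, provided samtools is installed and on the PATH.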
log will also try to determine from your command which resources
(programs & input files) were involved,
either explicitly (mentioned in the command itself) or implicitly (changed on the
surrounding filesystem over the course of the execution).
This is accomplished by comparing the MD5 checksums of all the resources before and after execution. All of this information is then logged in a graph database that links together input files and execution events. With this graph, a file's MD5 checksum is all we need to find it in our database. From there we can walk through any pipelines that created, used, modified, or deleted this file, and see when that happened, what the output was, and so on.
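The before/after checksum comparison can be sketched as follows. This is an assumption about how such a comparison might look, not log's actual implementation: md5sum, snapshot and diff_snapshots are hypothetical names, and the sketch scans a single directory tree rather than deciding which resources to track.

```python
import hashlib
import os


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 checksum of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map every regular file under root to its MD5 checksum."""
    checksums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                checksums[path] = md5sum(path)
    return checksums


def diff_snapshots(before, after):
    """Classify paths as created, deleted or modified between two snapshots."""
    return {
        "created": sorted(p for p in after if p not in before),
        "deleted": sorted(p for p in before if p not in after),
        "modified": sorted(
            p for p in before if p in after and before[p] != after[p]
        ),
    }
```

Taking one snapshot before the command runs and another afterwards, then passing both to diff_snapshots, gives the set of implicitly changed files.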
log can also be used to back up all unique resources below a set file size.
This is great not only for backing up gene lists or other small intermediate files during the course of an analysis, but also for scripts and programs in various stages of development. With the exact command line parameters and versions of the programs/scripts backed up, whole pipelines can be reprocessed years after they were run with very little effort. Please note that backups are only ever stored locally!
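One way to keep only unique resources below a size limit is to store each file under its checksum, so identical content is never copied twice. The sketch below illustrates that idea under stated assumptions: backup_if_small, backup_dir and max_bytes are hypothetical names, and the storage layout is not taken from log itself.

```python
import hashlib
import os
import shutil


def backup_if_small(path, backup_dir, max_bytes):
    """Copy path into backup_dir, named by its MD5 checksum.

    Returns the backup path, or None if the file is not below max_bytes.
    Hypothetical helper for illustration; not log's real storage layout.
    """
    if os.path.getsize(path) >= max_bytes:
        return None                      # too large, skip the backup
    with open(path, "rb") as handle:     # file is small, so read it whole
        checksum = hashlib.md5(handle.read()).hexdigest()
    os.makedirs(backup_dir, exist_ok=True)
    target = os.path.join(backup_dir, checksum)
    if not os.path.exists(target):       # identical content is stored once
        shutil.copy2(path, target)
    return target
```

Because the backup is keyed by checksum, two versions of a script in development end up as two separate backups, while an unchanged gene list is stored only once no matter how often it is used.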
Finally, log offers the user a growing number of helper functions such as:
- suppress command output (but still log it)
- run the command via screen
- email the user after execution
- call/sms the user after execution
All of these can be set as defaults if desired, or used one-off by passing parameters to log itself, such as:
> log +call callTo=004412345678 samtools sort data.bam
These helper functions are so useful that many people run log without logging, just to use them.
In keeping with the AC.GT philosophy, log is incredibly easy to install.
- create an account on log.bio
- download the latest version of log here
- run log for the first time, starting the interactive installer
Of course you can customize your installation to run your own authentication server and your own logging databases rather than use the public ones. For instructions on how to do this, check out the videos below.
Like every other project we host, log is totally free and open source.
Anyone can read the code and submit updates & improvements to get recognition in the source code :)