Getting IP addresses with Facter
When configuring services with Puppet, you sometimes need to know the IP address of a server. For example, I export Nagios host definitions from many of my servers and use them to configure my Nagios service. Facter ships with two sets of network-related facts that can help.
Splitting FASTA with Biopython
A student in the lab associated with my employer asked me for advice on how to extract records from a FASTA file. The hitch was that he wanted a large number of records, on the order of thousands, and the FASTA file was even larger, containing tens of millions of records. The first approach that came to my mind was splitting the file into chunks that were small enough to fit into memory on the nodes in our cluster. This would allow multiple CPUs to search for the records of interest while eliminating the greatest potential performance killer, lots of disk seeks. I’d never used Biopython before, so this request seemed like a good excuse to try it. It turned out to be remarkably easy to learn enough to accomplish what I wanted; going from idea to tested code took about an hour.
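The chunking idea can be sketched roughly as follows. This is not the code from the post: to keep it dependency-free it uses a small hand-written FASTA reader from the standard library instead of Biopython, and the function names, the chunk file naming scheme and the records_per_chunk parameter are all assumptions made for illustration.

```python
import itertools
import os


def read_fasta(handle):
    """Yield (header_line, sequence_lines) for each record in an open FASTA file."""
    header, lines = None, []
    for line in handle:
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, lines
            header, lines = line, []
        elif header is not None:
            lines.append(line)
    if header is not None:
        yield header, lines


def split_fasta(path, records_per_chunk, out_dir):
    """Write consecutive groups of records to numbered chunk files.

    Only one chunk is held in memory at a time. The chunk file names
    (chunk_00000.fasta, ...) are a hypothetical convention.
    """
    chunk_paths = []
    with open(path) as handle:
        records = read_fasta(handle)
        for index in itertools.count():
            chunk = list(itertools.islice(records, records_per_chunk))
            if not chunk:
                break
            chunk_path = os.path.join(out_dir, f"chunk_{index:05d}.fasta")
            with open(chunk_path, "w") as out:
                for header, lines in chunk:
                    out.write(header)
                    out.writelines(lines)
            chunk_paths.append(chunk_path)
    return chunk_paths
```

Each chunk file could then be handed to a separate job on the cluster, with records_per_chunk chosen so that one chunk fits comfortably in a node's memory.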
Shortening disk benchmark time
When benchmarking storage configurations, it’s important that your benchmark’s set of test data be larger than the amount of memory in the computer where the benchmark is running. Otherwise, the results will reflect your operating system’s caching behaviour more than your storage system. For example, on a server with 8GB of RAM I would set iozone’s -g argument to 16GB. Of course, tests that write this much data can take a long time, which is a problem when you’re trying to test many small tweaks to your storage. One way to speed this up is to use the Linux kernel’s mem boot parameter. For example, in my GRUB configuration I could set kernel /vmlinuz ro root=/dev/example/root mem=1G. When I boot with this parameter set, the system will only have 1GB of memory available, so I can reduce my test data to 2GB. This makes iterating through many storage configurations much quicker.
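The sizing rule in the examples above (test data at twice the available memory) can be worked out from the MemTotal line of /proc/meminfo. The sketch below is only an illustration: the function name, the factor of two as a default and the rounding up to whole gigabytes are my assumptions, and the sample MemTotal values are made up.

```python
import math


def iozone_max_file_size(meminfo_text, factor=2):
    """Return a size such as '16g' for iozone's -g argument.

    meminfo_text is the content of /proc/meminfo. Memory is rounded up to
    whole GiB and multiplied by factor so the test data exceeds the page cache.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            mem_kb = int(line.split()[1])  # value is reported in kB
            mem_gb = math.ceil(mem_kb / (1024 * 1024))
            return f"{mem_gb * factor}g"
    raise ValueError("MemTotal not found in meminfo text")


# Hypothetical MemTotal values for an 8GB server and for a boot with mem=1G.
print(iozone_max_file_size("MemTotal:        8167848 kB\n"))
print(iozone_max_file_size("MemTotal:        1017520 kB\n"))
```

On a real system the text would come from reading /proc/meminfo, and the result would be passed to iozone as the value of -g.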