Our software work has shifted from building tools (e.g., DNA Dictionary, genome browsers) to machine learning classification experiments (LeBlanc et al. 2012, 2013).
Sharing a slice of experimental time: a Suite of Scripts
- Frequency Counts of Motifs
This script assumes that cutter.pl has already been run. It reads every file created by cutter.pl that matches the data type given on the command line, counts how many times each unique lmer and its reverse complement appear in the genome, and writes the results to a series of .xls files, one for each combination of lmer size and input file.
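The counting step described above can be sketched as follows. This is a minimal illustration in Python, not the original Perl script. The function names (`reverse_complement`, `count_lmers`, `counts_with_reverse_complement`) are hypothetical, and the sketch assumes the input is a plain DNA string rather than a file produced by cutter.pl. It also skips any window containing characters other than A, C, G, or T, which is an assumption and not something the description specifies.

```python
from collections import Counter

# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_VALID_BASES = set("ACGT")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_lmers(sequence, l):
    """Count every overlapping lmer of length l in the sequence."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - l + 1):
        lmer = sequence[i:i + l]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if set(lmer) <= _VALID_BASES:
            counts[lmer] += 1
    return counts


def counts_with_reverse_complement(sequence, l):
    """Map each observed lmer to (its count, count of its reverse complement)."""
    forward = count_lmers(sequence, l)
    return {
        lmer: (n, forward.get(reverse_complement(lmer), 0))
        for lmer, n in forward.items()
    }


if __name__ == "__main__":
    # Print one tab-separated row per lmer, sorted alphabetically.
    table = counts_with_reverse_complement("ACGTAC", 2)
    for lmer, (n, n_rc) in sorted(table.items()):
        print(f"{lmer}\t{n}\t{n_rc}")
```

A dictionary keyed by lmer keeps the forward count and the reverse-complement count side by side, so each entry corresponds to one row of an output table.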
This particular script queries a database to gather metadata about the bugs in the data directory. The data gathered include each organism's reference sequence, super kingdom, group, genus, species, strain, oxygen requirements, habitat, temperature range, and pathogenic data.
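A metadata lookup of this kind might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the page does not say which database the script queries, so the sketch uses Python's built-in `sqlite3` module, a hypothetical table named `organisms`, and hypothetical column names derived from the fields listed above.

```python
import sqlite3

# Hypothetical column names, one per metadata field listed in the description.
COLUMNS = [
    "refseq", "super_kingdom", "org_group", "genus", "species",
    "strain", "oxygen_req", "habitat", "temp_range", "pathogenic",
]


def create_schema(conn):
    """Create the hypothetical 'organisms' table (all columns stored as text)."""
    column_defs = ", ".join(f"{name} TEXT" for name in COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS organisms ({column_defs})")


def fetch_metadata(conn, refseq_ids):
    """Return {refseq: metadata dict} for the requested reference sequences."""
    refseq_ids = list(refseq_ids)
    if not refseq_ids:
        return {}
    # One '?' placeholder per identifier keeps the query parameterized.
    placeholders = ", ".join("?" for _ in refseq_ids)
    query = (
        f"SELECT {', '.join(COLUMNS)} FROM organisms "
        f"WHERE refseq IN ({placeholders})"
    )
    rows = conn.execute(query, refseq_ids).fetchall()
    return {row[0]: dict(zip(COLUMNS, row)) for row in rows}
```

In practice the list of identifiers would come from the file names in the data directory; how the original script derives them is not stated here.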