Suite of Scripts
Note: In order for each script to work properly you must download the whole suite of scripts and save them into a common directory.
-
Cut Genomes into Chunks
cutter.pl ("Script #1") is the first of a suite of scripts designed to assist in the analysis of DNA. This particular script breaks a large DNA sequence into smaller chunks of a user-specified size.
cutter.zip - ReadMe
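The chunking step can be illustrated with a short Python sketch. This is not the code in cutter.pl; the function name `cut_sequence` is hypothetical, and the sketch assumes non-overlapping chunks with a shorter final chunk when the sequence length is not a multiple of the chunk size, which the description above does not specify.

```python
def cut_sequence(sequence, chunk_size):
    """Split a DNA string into consecutive, non-overlapping chunks.

    Assumption: the last chunk is simply shorter when the sequence
    length is not an exact multiple of chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [sequence[i:i + chunk_size]
            for i in range(0, len(sequence), chunk_size)]


if __name__ == "__main__":
    # A 10-base toy sequence cut into chunks of 4 bases.
    print(cut_sequence("ACGTACGTAC", 4))  # ['ACGT', 'ACGT', 'AC']
```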
-
Frequency Counts of Motifs
This script assumes that cutter.pl has already been run. It goes through all the files created by cutter.pl that match the type of data specified on the command line and counts the number of times each unique lmer, as well as its reverse complement, appears in the genome. The results are written to a series of .xls files, one for each combination of lmer size and input file.
countMotifs.zip - ReadMe
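As a rough illustration of the counting described above, the following Python sketch counts every overlapping lmer in a sequence and pairs each count with the count of its reverse complement. It is a sketch under assumptions, not the script's actual logic: the function names are hypothetical, windows are assumed to overlap, and only the bases A, C, G, and T are handled.

```python
from collections import Counter

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif (A, C, G, T only)."""
    return motif.translate(COMPLEMENT)[::-1]


def count_lmers(sequence, l):
    """Count every overlapping window of length l in the sequence."""
    if l <= 0:
        raise ValueError("l must be a positive integer")
    return Counter(sequence[i:i + l] for i in range(len(sequence) - l + 1))


def count_with_reverse_complement(sequence, l):
    """Map each observed lmer to (its count, count of its reverse complement)."""
    forward = count_lmers(sequence, l)
    return {motif: (n, forward.get(reverse_complement(motif), 0))
            for motif, n in forward.items()}


if __name__ == "__main__":
    for motif, counts in sorted(count_with_reverse_complement("AACGTT", 2).items()):
        print(motif, counts)
```

Whether the original script reports the two counts separately or sums them is not stated, so the sketch keeps them as a pair.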
-
Prepare Data for R
This script takes the various motif counts created by the motifCounts.pl script and combines them into a single .xls file for use in statistical analysis. It also adds some additional metadata.
prepare4R.zip - ReadMe
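The combining step might look something like the Python sketch below, which merges several per-file motif counts into one tab-delimited table that R could read. The layout (one row per input, one column per motif, zero for a missing motif) and the names `combine_counts` and `to_tsv` are assumptions made for illustration; the actual output format of the script, and the metadata it adds, are not described here.

```python
import csv
import io


def combine_counts(counts_by_source):
    """Merge per-source motif counts into a header and a list of rows.

    counts_by_source maps a source name (hypothetical, e.g. a chunk file)
    to a dict of motif -> count. Motifs missing from a source get 0.
    """
    motifs = sorted({m for counts in counts_by_source.values() for m in counts})
    header = ["source"] + motifs
    rows = [[source] + [counts.get(m, 0) for m in motifs]
            for source, counts in sorted(counts_by_source.items())]
    return header, rows


def to_tsv(header, rows):
    """Render the combined table as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    header, rows = combine_counts({
        "chunk2": {"AC": 3},
        "chunk1": {"AC": 1, "GT": 2},
    })
    print(to_tsv(header, rows), end="")
```

A table written this way can be loaded in R with `read.delim`.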
Additional Scripts
-

extractGroupPhylum
This script queries a database to gather metadata about the organisms in the data directory. Data gathered include each organism's reference sequence, superkingdom, group, genus, species, strain, oxygen requirements, habitat, temperature range, and pathogenic data.
extractGroupPhylum.zip - ReadMe