This web-based tool lets you "scrub" (clean) your Unicode texts, cut texts into chunks of various sizes, manage chunks and chunk sets, tokenize with character or word n-grams or TF-IDF weighting, and choose from a suite of analysis tools for investigating those texts. Functionality includes building dendrograms, graphing rolling averages of word frequencies or of ratios of words or letters, and exploring visualizations of word frequencies, including word clouds and bubble visualizations. To support text-mining analyses beyond the scope of this site, users can also transpose and download their matrices of word counts or relative proportions as comma- or tab-separated files (.csv, .tsv).
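For readers who want a feel for what the cutting, tokenizing, and matrix-export steps involve, the following is a minimal standard-library sketch. It is not the Lexos implementation: every function name here (`cut_into_chunks`, `word_ngrams`, `count_matrix`, `to_proportions`, `transposed_csv`) is hypothetical, and it assumes the text has already been scrubbed and split into whitespace-separated tokens.

```python
import csv
import io
from collections import Counter


def cut_into_chunks(tokens, chunk_size):
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def word_ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_matrix(chunks, n=1):
    """Return (sorted vocabulary, rows); each row holds one chunk's n-gram counts."""
    counts = [Counter(word_ngrams(chunk, n)) for chunk in chunks]
    vocab = sorted(set().union(*counts))
    rows = [[c[term] for term in vocab] for c in counts]
    return vocab, rows


def to_proportions(rows):
    """Convert raw counts to relative proportions within each chunk."""
    result = []
    for row in rows:
        total = sum(row)
        result.append([value / total if total else 0.0 for value in row])
    return result


def transposed_csv(vocab, rows, delimiter=","):
    """Serialize the matrix transposed: one row per term, one column per chunk."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["term"] + [f"chunk{i + 1}" for i in range(len(rows))])
    for j, term in enumerate(vocab):
        writer.writerow([term] + [row[j] for row in rows])
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative input only; assumed to be already scrubbed and lowercased.
    tokens = "the cat sat on the mat the dog sat".split()
    chunks = cut_into_chunks(tokens, 5)
    vocab, rows = count_matrix(chunks, n=1)
    print(transposed_csv(vocab, to_proportions(rows), delimiter="\t"))
```

Passing `delimiter="\t"` yields tab-separated output and the default yields comma-separated output, mirroring the two download formats mentioned above. Character n-grams could be sketched the same way by passing a list of characters instead of words.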
- Use the tool online: lexos v2.5 -- an integrated lexomics workflow
Tutorials and transcripts for lexomics analysis can be found here.
Download the software for this open-source tool:
- GitHub -- https://github.com/WheatonCS/Lexos -- download the .zip (or clone the repository) from the GitHub page; local install directions are included
The history of our lexomics tool set began with a suite of command-line Perl scripts (2011) and proceeded to a set of three independent web-based tools (2012). Access to the previous iterations of our tools and associated software can be found here.