Developing scholarly tools

The 2014 Lexomics Research Team (left to right): (Back) Jinnan Ge, Jonathan Gerkin, Elizabeth Peterson, Tom Armstrong, Clayton Rieck, Richard Neal, Bryan Jensen (software lead). (Front) Mark LeBlanc, Qi (Sara) Zhang, Lithia Helmreich, Jillian Valerio, Stephanie Lowell, Mike Drout. (Missing: Scott Kleinman.)

A Wheaton College-based research team plans to make computer text analysis more accessible to humanities scholars and students with a grant from the National Endowment for the Humanities (NEH).

The college’s Lexomics Research Group recently received a $60,000 Digital Humanities Start-Up Grant to make the interface for its Lexos software package easier to learn. The group also will embed video and text guides that explain how, when and why to use the program’s various analytical tools.

The grant will provide funding for the team, which includes Wheaton College professors Michael Drout and Mark LeBlanc as well as California State University Northridge Professor Scott Kleinman, to continue its efforts over the next two summers. It marks the third grant the team has received from the NEH. The team also has received support from the Mellon Foundation and from the college’s endowed funds.

“This will not be standard help that tells the user to ‘click-here’ or ‘pull-down there,’ but rather help that shows an expert discussing why you might want to take a particular step, such as dividing a novel into short segments of text,” said LeBlanc, a professor of computer science.

Typically, scholars interested in text analysis come from humanities disciplines where critical and qualitative analysis receive more attention than the quantitative methods employed by computers. The group’s goal is to “lower the barriers required for computer-assisted text analysis when using a broad range of texts,” according to the team’s grant proposal.

Professors Drout and LeBlanc joined forces more than a decade ago to explore the use of computer analysis in literary research. Together they developed the course connection “Computing with Texts,” including LeBlanc’s course “Computing for Poets” in 2005, and they began applying computing power to discern patterns in language usage that offered insights into the authorship of texts. Professor of Mathematics Michael Kahn also has participated in the project, providing his expertise in statistics.

“One thing we have found that separates Lexomics from many other digital projects is that from the very beginning we decided that tools are not enough,” said Drout, an English professor. “There are tons of complicated and powerful tools out there, but nobody uses them because it’s not clear how to use them.”

In contrast, Drout said, the Wheaton team’s integration of computer science and humanities scholars has enabled them to “simultaneously develop both the tools and the techniques for using them.”

The research group also is notable because of the opportunities it creates for students from various disciplines in the sciences and the humanities to work together with faculty members and contribute to ongoing work. Indeed, the project has led to students sharing authorship for journal articles with the professors and opportunities to present at international conferences in computer science, humanities and the digital humanities.

In many cases, the experience has served as a stepping stone to advanced study and future careers. Most recently, Rosetta Berger ’15 won admission to the Ph.D. program in linguistics at Yale University. She also recently learned that her research article had been accepted for publication in the multidisciplinary journal Viking and Medieval Scandinavia.

“The ‘soft skills’ that students gain from working on a team to build tools that actually help other scholars are as valuable as the technical skills they learn,” LeBlanc said. “For example, participating in a group review, or walkthrough, of one’s own software is a humbling but essential experience.”

Christina Nelson Conroy ’11, who now works as an engineer for Raytheon, echoes Professor LeBlanc’s point. She recalled being regularly called upon to explain her work to a large group of students and other faculty members in development meetings during the summer of 2008.

“It helped me to realize that feedback isn’t a bad thing, and the suggestions and constructive criticism received by putting your work out there for others to review is what makes the work strongest in the end,” she said. “I’m always holding peer reviews as part of my job, and it’s important to be able to not treat each comment like a personal attack.”

Vicki Li ’14 says that the discipline of trying to understand the goals of the literary scholars proved to be a key for her work as a support advisor at the global software firm Intersystems. “I became more intentional about translating what they wanted to technical solutions, which was challenging but essential,” she said. “In fact, I was able to talk about my experiences working with the Lexomics group and my value on the soft skills during interviews for jobs.”

Richard Neal remembers those walkthroughs vividly. “When I made changes to a tool, every so often we’d go through what we worked on with Professor Drout, to make sure what we had made not only worked, but made sense in the scope of the project, that it would be useful to him and his students.”

Now a software engineer with Microsoft, Neal said, “Sometimes our work fell short, other times it delighted, but no matter what, the feedback was invaluable in determining what we’d work on next, and in turn, what I learned working with scholars across disciplines was instrumental in the success I’ve had to date as an engineer.”

With the new grant in hand, Professors Drout and LeBlanc expect to collaborate with a large team of nearly 20 undergraduates who bring a variety of interests and skills in programming, the mathematical underpinnings of the computational analysis, literary analysis and video production.

The team will pursue several related projects, including work on the user interface and refinement of its analytical processes through the study of medieval, Shakespearean and Victorian texts. The team also will expand upon the tutorial videos that it began making in recent years to explain concepts of computer analysis of texts, such as how to read a dendrogram, which can visualize similarities and differences in a large text or among shorter works.

“We plan to videotape our own [Professor of Classics] Joel Relihan, who will talk about the importance of punctuation in regard to style,” said LeBlanc, explaining the team’s goals for developing the software’s new “In the Margins” feature. The ultimate goal, he said, is to enable “a video to pop up at the point where the user might want to ‘Remove all punctuation’ from their set of 50 poems.”