Harvard’s Institute for Quantitative Social Science has developed a research tool that helps users group and organize large sets of digital documents and make sense of them through an interactive process. The Making Sense of Thousands of Messages project included development work on the tool and evaluated whether it could be used for library projects that involve large amounts of unstructured text. The tool, named Consilience, provides an interactive user interface that helps users discover different ways of grouping a set of documents, zoom into each cluster within a selected grouping, and zoom in further to individual documents. In particular, the project used a set of email messages from the Harvard Library email archiving project to test whether Consilience can provide useful clusters and categories for curating and cataloging a set of text documents. Initial testing with archivists suggested that the tool has good potential to assist with processing large text collections.

Project Team: 

Mercè Crosas
Director of Product Development
Institute for Quantitative Social Science

Andrea Goethals
Manager of Digital Preservation and Repository Services
Harvard Library

Wendy Gogel
Manager of Digital Content and Projects
Harvard Library

Ellen Kraffmiller
Technical Lead of Software Development
Harvard Institute for Quantitative Social Science

Brandon Stewart
PhD Candidate
Harvard University Department of Government

Robert Treacy
Senior Software Architect and Engineer
Harvard Institute for Quantitative Social Science