HathiTrust Web and UX Research

Overview

This project surveys and analyzes the HathiTrust Research Center (HTRC) information landscape, with the aim of improving web presence, usability, and overall efficiency by 15%. Through user testing and behavioral analysis, the research identifies prevailing trends and best practices in text data mining within the humanities and social sciences.

Key data science contributions include natural language processing (NLP) pipelines built with NLTK and spaCy to analyze textual patterns across a corpus of over 50,000 documents. Predictive modeling was used to anticipate user search behavior, resulting in a 20% reduction in content retrieval time. Data visualizations were also used to communicate insights about document accessibility, making those insights easier for users to understand and interact with.
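
As a rough illustration of the kind of textual-pattern analysis described above, the following is a minimal sketch, not the project's actual pipeline. It uses only the Python standard library as a stand-in for the tokenization and stop-word filtering that NLTK or spaCy would provide. The stop-word list, the function names `tokenize` and `corpus_term_counts`, and the sample documents are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list; NLTK and spaCy ship much larger ones.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}

# Matches runs of lowercase letters, optionally with one internal apostrophe.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens, minus stop words."""
    return [t for t in TOKEN_RE.findall(text.lower()) if t not in STOP_WORDS]


def corpus_term_counts(documents):
    """Return (term_frequency, document_frequency) Counters for a corpus."""
    term_freq = Counter()
    doc_freq = Counter()
    for text in documents:
        tokens = tokenize(text)
        term_freq.update(tokens)      # total occurrences of each term
        doc_freq.update(set(tokens))  # number of documents containing each term
    return term_freq, doc_freq


# Illustrative sample documents, not drawn from the HTRC corpus.
sample_docs = [
    "Text mining in the humanities",
    "Mining the text of digitized books",
]
term_freq, doc_freq = corpus_term_counts(sample_docs)
```

In a real pipeline, `tokenize` would likely be replaced by a library tokenizer, and the two counters could feed a weighting scheme such as TF-IDF before any modeling or visualization.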

Ultimately, this study aims to address both the technical and behavioral needs of researchers who use text analysis methods, improving their research experience and productivity.

Screenshots

The Research

Analysis Charts