Digital Humanities Workshops


Since the Spring 2018 semester I have served as the Digital Humanities Graduate Specialist for Rutgers Libraries. Over this time I’ve developed a number of workshops, from basic tutorials on software popular among digital humanists (Gephi, Palladio, Tropy, etc.) to introductory and intermediate lessons on R programming in the humanities. Each link leads to a GitHub repository page, where you can find all the materials – code, sample data, and step-by-step instructions – for running the workshop on your own.

Workshops in R

What Can Be Data in the Humanities? This workshop helps participants translate humanities materials into usable data formats. It provides a gentle introduction to programming, surveys examples of effective data-driven scholarship, and discusses some costs and benefits of translating into data.

Data 101: So you’ve finally assembled or gotten your hands on a dataset or spreadsheet. Nice work! Not sure what to do next? This introductory workshop on data organization, manipulation, and analysis teaches participants how to engage with broader patterns or systems of relation in their research.

Intro to Quantitative Text Analysis: This workshop introduces participants to the basics of quantitative textual analysis in the R programming language. Participants will explore a single book through a variety of approaches, including word co-appearance, character distribution, and sentiment analysis.

Webscraping Techniques in R: Ever more information can be accessed online, but often there is no easy way to obtain it for further analysis. This workshop introduces web scraping, a technique for extracting data and data structures from public websites. Using browsers and the R programming language, it also demonstrates strategies for handling different kinds of websites.

Text Mining Approaches with Historical Newspapers: These materials form a robust, two-part workshop on text analysis of Chronicling America newspapers in R. The first part introduces strategies for fuzzy string matching, using the OCR-derived text from the Perth Amboy Evening News; you’ll need to download data from Chronicling America first, as directed in the materials. The second part builds on the results of the first and explores a few possible methods for analyzing phrase use over time, page location, collocate words, and uniqueness.
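
For readers unfamiliar with fuzzy string matching, here is a minimal sketch of the general idea. It is not taken from the workshop materials: the workshop itself uses R, while this illustration uses Python's standard-library difflib, and the function name fuzzy_find, the 0.8 threshold, and the sample sentence are all hypothetical. It slides a window of words across a text and keeps any window that is sufficiently similar to the target phrase, which is one way to catch OCR misspellings.

```python
from difflib import SequenceMatcher


def fuzzy_find(target, text, threshold=0.8):
    """Return (word_position, window, score) for word windows resembling target.

    Hypothetical helper for illustration; the threshold is an assumed default.
    """
    target_words = target.lower().split()
    normalized_target = " ".join(target_words)
    words = text.split()
    size = len(target_words)
    matches = []
    # Compare the target against every run of `size` consecutive words.
    for start in range(len(words) - size + 1):
        window = " ".join(words[start:start + size])
        score = SequenceMatcher(None, normalized_target, window.lower()).ratio()
        if score >= threshold:
            matches.append((start, window, round(score, 2)))
    return matches


# Invented sample with OCR-style errors; not drawn from the newspaper data.
sample = "The Perth Arnboy Evening Nevvs reported on the Pcrth Amboy waterfront."
hits = fuzzy_find("Perth Amboy", sample)
for position, window, score in hits:
    print(position, window, score)
```

In this sample, an exact search for "Perth Amboy" would find nothing, while the similarity threshold picks up both misspelled variants. Lowering the threshold catches more damaged text at the cost of more false positives.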

Workshops with Other Tools

Network Analysis in Gephi: Network analysis is one of the most popular approaches in the digital humanities because it allows us to model relations between individuals, texts, locations, and more. These slides and sample data introduce the central concepts of network analysis before explaining how to use Gephi, a widely used program for analyzing and visualizing networks.

Intro to Mapping: What kind of information should be mapped? Which tool is best for the job? These slides include a primer on identifying what kind of data is necessary for or amenable to mapping, an evaluative survey of several mapping tools, and extended tutorials on StoryMapJS and Palladio.

Collecting Twitter Data with TAGS: This tutorial explains how to obtain data from Twitter via the TAGS tool for Google Sheets while addressing some of the conceptual uses and practical limitations of working with social media data.

High Performance Computing for Humanists at Rutgers: Sometimes humanist research can outstrip the resources of personal computers, whether because of the size of the data or the complexity of the operations in question. This tutorial explains how Rutgers affiliates can tackle such cases from their own computers by using Amarel, the high performance computing environment of the Rutgers Office of Advanced Research Computing.

Hugo and GitHub Pages Tutorial: Personal and project websites can be effective vehicles for sharing work. This tutorial introduces Hugo, an open-source static site generator for building websites locally by editing themes and adding original content. It also covers how to deploy the result with free hosting on GitHub Pages and how to add other features like custom domain names or analytics. Accessible overviews of HTML, Git, and the command line are included as needed.