May 11, 2023: Datathon Showcase – Computational Archival Storytelling with Jupyter Notebooks
Introduction
INST742 is called Implementing Digital Curation.
Catalog Description: Management of and technology for application of digital curation principles in specific settings. Characteristics, representation, conversion, and preservation of digital objects. Applications of standards for digitization, description, and preservation. Planning for sustainability, risk mitigation and disaster recovery.
Extended Course Description: The class is designed to provide hands-on learning experiences to students, with real-world environments and examples that touch on significant areas of digital curation. Selected topics will be chosen to allow students to find and explore best current practices in the use of representative tools. There will be assigned readings each week with opportunities to experiment with software environments and manage data and records. Due to the rapidly evolving nature of the field, the tools and topics will reflect current trends.
Learning Outcomes: Upon completion of this course, students will be able to:
- Demonstrate familiarity in the curation of digital records and data.
- Understand and evaluate digital curation systems.
- Demonstrate knowledge of major classes of digital curation environments.
Event
On May 11, 2023, a public event showcased original and innovative work conducted by MLIS graduate students in the INST742 class (“Implementing Digital Curation”) at the University of Maryland iSchool. The class (designed to provide hands-on learning experiences to students, with real-world environments and examples that touch on significant areas of digital curation) concluded with a two-week final project. Digital curation implementation topics explored in INST742 included:
- Archival Science concepts and workflows and Computational Thinking (CT)
- Digitization management (ABBYY FineReader)
- Cleaning & Transforming (OpenRefine)
- Data Wrangling (Trifacta)
- Clustering algorithms (Artificial Intelligence)
- Text Processing through NLP and NER (GATE: General Architecture for Text Engineering)
- Geospatial transformations through geocoding, geolocating, georeferencing, and vectorizing/tracing (QGIS, ArcGIS)
- Data visualization (Tableau Storyline and Tableau Dashboard)
- Network analysis through graph databases (Neo4j)
- Digital Curation at scale
- Virtual machines (UMD iSchool Virtual Computing Lab (VCL), Sandbox tools, Jupyter Notebooks)
This year, the entire 15-week class was articulated around a single collection, consisting of a sample of the 1911 Charlotte NC city directory.
“City directories are among the most important sources of information about urban areas and their inhabitants. They provide personal and professional information about a city’s residents as well as information about its business, civic, social, religious, charitable, and literary institutions.” (Library of Congress).
Paper
After the class, students were invited to co-author a book chapter on their experiences in INST742. See the chapter in the 2024 Archives and Primary Source Handbook, a peer-reviewed, open-access New Prairie Press textbook, at:
“Teaching and Learning with Archival Materials through the Development of Interactive Computational Notebooks”, P. Piety, M. Conrad, R. Marciano, I. Cornfield, E. Dallimore, R. Fettig, E. Hansen, H. Kemp, T. Turabi (2023). https://ai-collaboratory.net/wp-content/uploads/2023/10/Piety_Conrad_Marciano_et_al-FINAL.pdf
Speakers and Topics
- Eden Hansen: Mad or Madam: Investigating an Undefined Data Term
- Sams Wilson: Mapping Over Time in Charlotte NC: Population, Redlining, and Urban Renewal
- Bethany Greenho: Building a Bigger Picture: A Case Study of Combining the General City and Business Directories
- Rosemarie Fettig: Expanding the Network: Modeling Relationships with Neo4j
- Valerie Sallis: Revisualizing Geographic Disparities: Examining Trends in Racial and Economic Inequality on the Streets w/o GIS
- Mia Steinle: Religious Life in 1911 Charlotte, NC
- Sarah Craig: Gender, Race, and Archival Silences
- Elissa Dallimore: Conceptualizing Prosperity: A Case Study Analyzing Housing through Job Types
- Henry Kemp: Visualizing Neighborhood Demographics
- Isaiah Cornfield: Race, Marriage, and Profession: Data at Scale Test Case
The following table shows how each of the 10 final projects connected with Computational Thinking Practices and Tools (including levels of engagement with Jupyter Notebooks):
- Eden Hansen: Mad or Madam: Investigating an Undefined Data Term
- Sams Wilson: Mapping Over Time in Charlotte NC: Population, Redlining, and Urban Renewal
- Bethany Greenho: Building a Bigger Picture: A Case Study of Combining the General City and Business Directories
- Rosemarie Fettig: Expanding the Network: Modeling Relationships with Neo4j
- Valerie Sallis: Revisualizing Geographic Disparities: Examining Trends in Racial and Economic Inequality on the Streets w/o GIS
- Mia Steinle: Religious Life in 1911 Charlotte, NC
- Sarah Craig: Gender, Race, and Archival Silences
- Elissa Dallimore: Conceptualizing Prosperity: A Case Study Analyzing Housing through Job Types
- Henry Kemp: Visualizing Neighborhood Demographics
- Isaiah Cornfield: Race, Marriage, and Profession: Data at Scale Test Case
Acquiring or Accessing the Data
Students used the 1911 Charlotte Historical Directory for their work. The class focused on the first 5 pages; for the final project, however, students had access to the entire datafied directory to explore processing and access at scale.
Audience
- Richard Marciano: INST742 Instructor, UMD
- Rogers Hall: Professor & Chair, Dep. of Teaching and Learning, Vanderbilt U.
- Mark Conrad: former digital archivist at the National Archives, Advanced Information Collaboratory (AIC)
- Greg Jansen: Senior Research Software Architect, U. Maryland iSchool
- Sarah Buchanan: Associate Professor, School of Information Science & Learning Technologies, U. Missouri
- Mark Hedges: Professor, Chair of the Digital Humanities Dep., King’s College London
Notebook links & YouTube video presentation links
- Notebook: Eden Hansen -- Video: https://youtu.be/8fu0UrGJRfE (8′ 03″)
- Notebook: Sams Wilson -- Video: https://youtu.be/9UWFnz-afqU (14′ 56″)
- Notebook: Bethany Greenho -- Video: https://youtu.be/4pmFzRwO_1o (11′ 48″)
- Notebook: Rosemarie Fettig -- Video: https://youtu.be/CGJgXU0o-U8 (12′ 30″)
- Notebook: Valerie Sallis -- Video: https://youtu.be/hyeYfOKnFBs (9′ 57″)
- Notebook: Mia Steinle -- Video: https://youtu.be/uIbvZMRW_-I (11′ 10″)
- Notebook: Sarah Craig -- Video: https://youtu.be/tJI28XOcMmU (12′ 40″)
- Notebook: Elissa Dallimore -- Video: https://youtu.be/MaD9lIYM7iY (9′ 07″)
- Notebook: Henry Kemp -- Video: https://youtu.be/85-qL-VBY14 (12′ 04″)
- Notebook: Isaiah Cornfield -- Video: https://youtu.be/hyeYfOKnFBs (7′ 41″)