#!/usr/bin/env python
# coding: utf-8
# # Gender, Race, and Archival Silences in the Charlotte 1911 Directory
# Sarah Craig
# INST742, Spring 2023
#
#
#
# ## 7. Gender, Race, and Archival Silences:
# * **Author:** Sarah Craig
# * **Abstract:** Data visualization and analysis of women in the historical city directory (gender, married and widow status). Inferring meaning in the context of archival silences and ambiguity.
# * **Dataset:** Full datified Directory (16,000 entries)
# * **Tools:** OpenRefine, Tableau
# * **Video:** https://youtu.be/tJI28XOcMmU (12′ 40″)
#
# **archival silence**
# *n.*
# 1. a gap in the historical record resulting from the unintentional or purposeful absence or distortion of documentation
#
# https://dictionary.archivists.org/entry/archival-silence.html
# For our final project, we were asked to do something with the complete data set created by OCRing and datafying the Charlotte, North Carolina 1911 city directory.
#
# As we worked with a subset of this data throughout the semester, I noticed an obvious element of archival silence: Married women outside the workforce were generally only referenced in their husband's directory entries, rather than having entries of their own. Thus, the datafication of the directory included their names only as information for their husbands. They were not represented with their own rows in the data set.
#
# My initial goal with this project was to bring these obliquely-referenced women into the data set. As I worked to mitigate this particular archival silence, however, I uncovered more archival silences in the directory. These archival silences have bearing on what we can and cannot safely infer from this data set. On a meta level, they also reflect the worldviews and biases of the directory's authors and provide insight into the social landscape of the urban Appalachian South in the early twentieth century.
# ***
#
# ## Married Men and Married Women
#
# We started with this data set: [Charlotte1911-consolidated.xlsx](./Charlotte1911-consolidated.xlsx)
#
# This data set incorporates both the business directory and the residential directory. It consists of 15,711 rows with data in some or all of the following columns:
# * Original copy from the directory entry
# * Race
# * Name (Full)
# * Title
# * Married
# * Widow-of
# * Spouse
# * In Business Directory?
# * Is Business?
# * Job
# * Company
# * Company Details (Address and/or Phone)
# * Housing (Residential Address)
# The complete data set was a little bit too large for OpenRefine to handle comfortably, so as a first step, I eliminated the data from the business directory by filtering on In Business Directory. Then I divided the remaining data into two subsets. In Excel, I filtered by nulls in the Spouse column to create the data sets 1) All Married Men and 2) All Unmarried and Widowed, which I saved as separate files.
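#
# The filtering above was done by hand in OpenRefine and Excel. As a rough illustration only, the same split could be sketched in plain Python. The assumption that a blank `In Business Directory?` cell marks a residential entry, the helper name `split_residential`, and the sample rows are all invented for this sketch and do not come from the directory.

```python
def split_residential(rows):
    """Drop business-directory rows, then split the rest on whether Spouse is filled in."""
    def is_blank(value):
        return value is None or str(value).strip() == ""

    # Assumption: a blank "In Business Directory?" cell marks a residential entry.
    residential = [r for r in rows if is_blank(r.get("In Business Directory?"))]
    # Rows with a Spouse value become "All Married Men"; the rest are "All Unmarried and Widowed".
    married_men = [r for r in residential if not is_blank(r.get("Spouse"))]
    unmarried_and_widowed = [r for r in residential if is_blank(r.get("Spouse"))]
    return married_men, unmarried_and_widowed


# Invented sample rows, for illustration only.
sample = [
    {"Name": "Doe John", "Spouse": "Mary", "In Business Directory?": ""},
    {"Name": "Roe Ann", "Spouse": "", "In Business Directory?": ""},
    {"Name": "Acme Co", "Spouse": "", "In Business Directory?": "Yes"},
]
married_men, unmarried_and_widowed = split_residential(sample)
```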
# The goal was to create a new row for each woman referenced only in her husband's entry - in effect, creating new rows based on existing rows. OpenRefine, however, isn't set up for data manipulation on the basis of rows. It does everything through the lens of columns. So to frame this problem in a way that could be solved with OpenRefine, I had to reframe it as a columns-based problem.
#
# What data does the original directory entry encode about these women?
# * First name
# * Last name
# * Title/marital status
# * Spouse
# * Residential address, assuming same as husband's
# * Race, assuming same as husband's
#
# (In the pre-*Loving v. Virginia* South, the assumption of same-race married couples is a reasonable one, but it does risk reifying archival silences of its own in the event that any of the couples in the directory actually were mixed-race but were not designated as such. Since the underlying primary source doesn't reference such cases or leave us with any clues to discern them, it makes sense to err on the side of the most probable assumption, but it's worth bearing in mind that this information *is* grounded in an assumption and is less certain than information actually printed in the directory.)
# The few pieces of data that we have about married women map fairly neatly onto a subset of the columns from our starting data set.
#
# To prepare the data, I split Name into two columns. For the sake of simplicity, I also collapsed Title and Married into one column.
#
# * Original copy from the directory entry
# * **Race**
# * **Last Name**
# * **First and Middle Name**
# * **Title**
# * Widow-of
# * **Spouse**
# * In Business Directory?
# * Is Business?
# * Job
# * Company
# * Company Details
# * **Housing**
[
{
"op": "core/column-split",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Name",
"guessCellType": true,
"removeOriginalColumn": true,
"mode": "separator",
"separator": " ",
"regex": false,
"maxColumns": 2,
"description": "Split column Name by separator"
},
{
"op": "core/column-rename",
"oldColumnName": "Name 2",
"newColumnName": "First and Middle Name",
"description": "Rename column Name 2 to First and Middle Name"
},
{
"op": "core/column-rename",
"oldColumnName": "Name 1",
"newColumnName": "Last Name",
"description": "Rename column Name 1 to Last Name"
},
{
"op": "core/text-transform",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Title",
"expression": "join ([coalesce(cells['Title'].value,''),coalesce(cells['Married'].value,'')],'')",
"onError": "keep-original",
"repeat": false,
"repeatCount": 10,
"description": "Text transform on cells in column Title using expression join ([coalesce(cells['Title'].value,''),coalesce(cells['Married'].value,'')],'')"
},
{
"op": "core/column-removal",
"columnName": "Married",
"description": "Remove column Married"
},
{
"op": "core/mass-edit",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Title",
"expression": "value",
"edits": [
{
"from": [
""
],
"fromBlank": true,
"fromError": false,
"to": "Mr"
}
],
"description": "Mass edit cells in column Title"
}
]
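# For readers who do not use OpenRefine, the operation history above can be approximated in plain Python. This is a sketch, not the workflow actually used: it assumes each row is a dict keyed by column name and that the split leaves everything after the first space in the second column. The function name `prepare_row` and the sample row are invented.

```python
def prepare_row(row):
    """Return a copy of a directory row with Name split and Title/Married collapsed."""
    out = dict(row)

    # Split Name at the first space: surname first, then first and middle names.
    name = out.pop("Name", "") or ""
    last, _, rest = name.partition(" ")
    out["Last Name"] = last
    out["First and Middle Name"] = rest

    # Concatenate Title and Married (treating blanks as empty), then drop Married.
    title = (out.get("Title") or "") + (out.pop("Married", None) or "")

    # Blank titles default to "Mr", mirroring the mass edit on the married-men subset.
    out["Title"] = title if title else "Mr"
    return out


# Invented sample row, for illustration only.
example = prepare_row(
    {"Name": "Doe John A", "Title": None, "Married": None, "Spouse": "Mary"}
)
```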
# At this point, I saved a copy of the data set so that I would have an updated version of All Married Men with the correct column configuration.
#
# Then, I applied a few more transformations to create the new data set All Married Women.
[
{
"op": "core/column-addition",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"baseColumnName": "First and Middle Name",
"expression": "grel:value",
"onError": "set-to-blank",
"newColumnName": "First and Middle Name 1",
"columnInsertIndex": 4,
"description": "Create column First and Middle Name 1 at index 4 based on column First and Middle Name using expression grel:value"
},
{
"op": "core/column-removal",
"columnName": "First and Middle Name",
"description": "Remove column First and Middle Name"
},
{
"op": "core/column-rename",
"oldColumnName": "Spouse",
"newColumnName": "First and Middle Name",
"description": "Rename column Spouse to First and Middle Name"
},
{
"op": "core/column-rename",
"oldColumnName": "First and Middle Name 1",
"newColumnName": "Spouse",
"description": "Rename column First and Middle Name 1 to Spouse"
},
{
"op": "core/mass-edit",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Title",
"expression": "value",
"edits": [
{
"from": [
"Mr"
],
"fromBlank": false,
"fromError": false,
"to": "Mrs"
}
],
"description": "Mass edit cells in column Title"
},
{
"op": "core/mass-edit",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"columnName": "Title",
"expression": "value",
"edits": [
{
"from": [
"Rev"
],
"fromBlank": false,
"fromError": false,
"to": "Mrs"
}
],
"description": "Mass edit cells in column Title"
},
{
"op": "core/column-move",
"columnName": "First and Middle Name",
"index": 5,
"description": "Move column First and Middle Name to position 5"
},
{
"op": "core/column-move",
"columnName": "First and Middle Name",
"index": 4,
"description": "Move column First and Middle Name to position 4"
},
{
"op": "core/column-move",
"columnName": "First and Middle Name",
"index": 3,
"description": "Move column First and Middle Name to position 3"
},
{
"op": "core/column-move",
"columnName": "Spouse",
"index": 5,
"description": "Move column Spouse to position 5"
},
{
"op": "core/column-move",
"columnName": "Spouse",
"index": 6,
"description": "Move column Spouse to position 6"
},
{
"op": "core/column-addition",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"baseColumnName": "Job",
"expression": "grel:null",
"onError": "set-to-blank",
"newColumnName": "Job 1",
"columnInsertIndex": 9,
"description": "Create column Job 1 at index 9 based on column Job using expression grel:null"
},
{
"op": "core/column-removal",
"columnName": "Job",
"description": "Remove column Job"
},
{
"op": "core/column-rename",
"oldColumnName": "Job 1",
"newColumnName": "Job",
"description": "Rename column Job 1 to Job"
},
{
"op": "core/column-addition",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"baseColumnName": "Company",
"expression": "grel:null",
"onError": "set-to-blank",
"newColumnName": "Company 1",
"columnInsertIndex": 10,
"description": "Create column Company 1 at index 10 based on column Company using expression grel:null"
},
{
"op": "core/column-removal",
"columnName": "Company",
"description": "Remove column Company"
},
{
"op": "core/column-rename",
"oldColumnName": "Company 1",
"newColumnName": "Company",
"description": "Rename column Company 1 to Company"
},
{
"op": "core/column-addition",
"engineConfig": {
"facets": [],
"mode": "row-based"
},
"baseColumnName": "Company Details",
"expression": "grel:null",
"onError": "set-to-blank",
"newColumnName": "Company Details 1",
"columnInsertIndex": 11,
"description": "Create column Company Details 1 at index 11 based on column Company Details using expression grel:null"
},
{
"op": "core/column-removal",
"columnName": "Company Details",
"description": "Remove column Company Details"
},
{
"op": "core/column-rename",
"oldColumnName": "Company Details 1",
"newColumnName": "Company Details",
"description": "Rename column Company Details 1 to Company Details"
}
]
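# Stripped of the column shuffling, the operations above swap the husband's given names with the Spouse value, retitle the row, and blank out the employment columns. A minimal Python sketch of that idea follows, assuming dict rows with the column names used above. The function name `wife_row` and the sample row are invented. Like the operation history, it only retitles "Mr" and "Rev" and leaves any other title unchanged.

```python
def wife_row(husband):
    """Derive a row for a wife from her husband's prepared directory row."""
    out = dict(husband)  # Race, Last Name, and Housing carry over as assumptions.

    # Swap the given names: the wife's name was stored in the husband's Spouse column.
    out["First and Middle Name"] = husband.get("Spouse")
    out["Spouse"] = husband.get("First and Middle Name")

    # Only "Mr" and "Rev" are retitled, matching the two mass edits above.
    if out.get("Title") in ("Mr", "Rev"):
        out["Title"] = "Mrs"

    # The husband's employment details do not describe the wife.
    for column in ("Job", "Company", "Company Details"):
        out[column] = None
    return out


# Invented sample row, for illustration only.
husband = {
    "Last Name": "Doe", "First and Middle Name": "John A", "Title": "Mr",
    "Spouse": "Mary", "Job": "clerk", "Company": "Acme Co",
    "Company Details": None, "Housing": "12 Example St",
}
wife = wife_row(husband)
```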
# I exported this data set as All Married Women. But there was one wrinkle: The few women who were listed under their own names *and* under their husbands' names now had duplicate entries. In the interest of preserving as much data as possible for each of them, I decided to manually dedupe the data in Excel. I color-coded the 205 Title = "Mrs" rows from the original data set and went down the list, combining rows as needed and merging several dozen duplicate entries.
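#
# The deduping itself was manual, but a short script could flag candidate pairs for review. The sketch below is one hypothetical way to do that, not the method used here. It assumes that matching on surname plus given names (ignoring case and surrounding spaces) is a good enough signal, and it leaves the actual merge to a human. The function name `duplicate_candidates` and the sample rows are invented.

```python
from collections import defaultdict


def duplicate_candidates(rows):
    """Group rows that share a normalized (Last Name, First and Middle Name) key."""
    groups = defaultdict(list)
    for row in rows:
        key = (
            (row.get("Last Name") or "").strip().lower(),
            (row.get("First and Middle Name") or "").strip().lower(),
        )
        groups[key].append(row)
    # Keep only the keys that occur more than once; these need a manual look.
    return {key: group for key, group in groups.items() if len(group) > 1}


# Invented sample rows, for illustration only.
rows = [
    {"Last Name": "Doe", "First and Middle Name": "Mary", "Spouse": "John A"},
    {"Last Name": "Doe", "First and Middle Name": "mary ", "Job": "dressmaker"},
    {"Last Name": "Roe", "First and Middle Name": "Ann", "Spouse": "Carl"},
]
candidates = duplicate_candidates(rows)
```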
#
# This process meant looking more closely at the data, and in doing so, I noticed a pattern: All of the duplicate entries were for white women, specifically. Could there really be no married Black women listed under their own names in the directory?
#
# In a later round of deduping, I would learn that a total of six married Black women did in fact have their own entries. But there was no way to discern that from the data set at this point, because none of those six women were listed as "Mrs." In fact, apart from a few presumably-masculine Reverends and the married men listed with their wives, **there were no overt markers of gender or gendered social status for any of the Black citizens in the directory.**
# In[2]:
get_ipython().run_cell_magic('HTML', '', "