#!/usr/bin/env python
# coding: utf-8

# # Gender, Race, and Archival Silences in the Charlotte 1911 Directory
# Sarah Craig
# INST742, Spring 2023

# ## 7. Gender, Race, and Archival Silences:
# * **Author:** Sarah Craig
# * **Abstract:** Data visualization and analysis of women in the historical city directory (gender, married and widow status). Inferring meaning in the context of archival silences and ambiguity.
# * **Dataset:** Full datafied Directory (16,000 entries)
# * **Tools:** OpenRefine, Tableau
# * **Video:** https://youtu.be/tJI28XOcMmU (12′ 40″)

# **archival silence**
# *n.*
# 1. a gap in the historical record resulting from the unintentional or purposeful absence or distortion of documentation
#
# https://dictionary.archivists.org/entry/archival-silence.html

# For our final project, we were asked to do something with the complete data set created by OCRing and datafying the Charlotte, North Carolina 1911 city directory.
#
# As we worked with a subset of this data throughout the semester, I noticed an obvious element of archival silence: Married women outside the workforce were generally only referenced in their husband's directory entries, rather than having entries of their own. Thus, the datafication of the directory included their names only as information for their husbands. They were not represented with their own rows in the data set.
#
# My initial goal with this project was to bring these obliquely-referenced women into the data set. As I worked to mitigate this particular archival silence, however, I uncovered more archival silences in the directory. These archival silences have bearing on what we can and cannot safely infer from this data set. On a meta level, they also reflect the worldviews and biases of the directory's authors and provide insight into the social landscape of the urban Appalachian South in the early twentieth century.

# ***
#
# ## Married Men and Married Women
#
# We started with this data set: [Charlotte1911-consolidated.xlsx](./Charlotte1911-consolidated.xlsx)
#
# This data set incorporates both the business directory and the residential directory. It consists of 15,711 rows with data in some or all of the following columns:
# * Original copy from the directory entry
# * Race
# * Name (Full)
# * Title
# * Married
# * Widow-of
# * Spouse
# * In Business Directory?
# * Is Business?
# * Job
# * Company
# * Company Details (Address and/or Phone)
# * Housing (Residential Address)

# The complete data set was slightly too large for OpenRefine to handle comfortably, so as a first step, I eliminated the data from the business directory by filtering on In Business Directory. Then I divided the remaining data into two subsets. In Excel, I filtered by nulls in the Spouse column to create the data sets 1) All Married Men and 2) All Unmarried and Widowed, which I saved as separate files.

# The goal was to create a new row for each woman referenced only in her husband's entry - in effect, creating new rows based on existing rows. OpenRefine, however, isn't set up for data manipulation on the basis of rows. It does everything through the lens of columns. So to frame this problem in a way that could be solved with OpenRefine, I had to reframe it as a columns-based problem.
#
# What data does the original directory entry encode about these women?
# * First name
# * Last name
# * Title/marital status
# * Spouse
# * Residential address, assuming same as husband's
# * Race, assuming same as husband's
#
# (In the pre-*Loving v. Virginia* South, the assumption of same-race married couples is a reasonable one, but it does risk reifying archival silences of its own in the event that any of the couples in the directory actually were mixed-race but were not designated as such. Since the underlying primary source doesn't reference such cases or leave us with any clues to discern them, it makes sense to err on the side of the most probable assumption, but it's worth bearing in mind that this information *is* grounded in an assumption and is less certain than information actually printed in the directory.)
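# The partition by Spouse and the "new row from an existing row" idea described above can be sketched in plain Python. This is only an illustration, not the OpenRefine/Excel workflow actually used in this project: the function names `split_by_spouse` and `wife_row` are hypothetical, the sample rows are invented, and the sketch assumes the Name column is stored surname-first.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]


def split_by_spouse(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Partition rows into (married men, unmarried and widowed) by whether Spouse is filled."""
    married = [r for r in rows if r.get("Spouse")]
    others = [r for r in rows if not r.get("Spouse")]
    return married, others


def wife_row(husband: Row) -> Row:
    """Build a new row for the wife named in a husband's entry."""
    # Assumption: Name is stored surname-first, e.g. "Doe John A".
    last, _, given = (husband.get("Name") or "").partition(" ")
    return {
        "Race": husband.get("Race"),         # assumed same as husband's
        "Name": f"{last} {husband['Spouse']}",
        "Title": "Mrs",
        "Spouse": given or None,             # husband's given names
        "Job": None,                         # the entry records no job for her
        "Housing": husband.get("Housing"),   # assumed same as husband's
    }


# Invented sample rows, not taken from the directory.
sample: List[Row] = [
    {"Race": "white", "Name": "Doe John A", "Spouse": "Mary",
     "Job": "clerk", "Housing": "12 Example St"},
    {"Race": "white", "Name": "Roe Jane", "Spouse": None,
     "Job": "teacher", "Housing": "34 Example Ave"},
]

married_men, unmarried_and_widowed = split_by_spouse(sample)
married_women = [wife_row(r) for r in married_men]
print(married_women)
```

# The two fields marked "assumed" in `wife_row` are exactly the ones the directory never states for these women, which is why they carry less certainty than the printed data.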
# The few pieces of data that we have about married women map fairly neatly onto a subset of the columns from our starting data set.
#
# To prepare the data, I split Name into two columns. For the sake of simplicity, I also collapsed Title and Married into one column.
#
# * Original copy from the directory entry
# * **Race**
# * **Last Name**
# * **First and Middle Name**
# * **Title**
# * Widow-of
# * **Spouse**
# * In Business Directory?
# * Is Business?
# * Job
# * Company
# * Company Details
# * **Housing**

[
  { "op": "core/column-split", "engineConfig": { "facets": [], "mode": "row-based" }, "columnName": "Name", "guessCellType": true, "removeOriginalColumn": true, "mode": "separator", "separator": " ", "regex": false, "maxColumns": 2, "description": "Split column Name by separator" },
  { "op": "core/column-rename", "oldColumnName": "Name 2", "newColumnName": "First and Middle Name", "description": "Rename column Name 2 to First and Middle Name" },
  { "op": "core/column-rename", "oldColumnName": "Name 1", "newColumnName": "Last Name", "description": "Rename column Name 1 to Last Name" },
  { "op": "core/text-transform", "engineConfig": { "facets": [], "mode": "row-based" }, "columnName": "Title", "expression": "join ([coalesce(cells['Title'].value,''),coalesce(cells['Married'].value,'')],'')", "onError": "keep-original", "repeat": false, "repeatCount": 10, "description": "Text transform on cells in column Title using expression join ([coalesce(cells['Title'].value,''),coalesce(cells['Married'].value,'')],'')" },
  { "op": "core/column-removal", "columnName": "Married", "description": "Remove column Married" },
  { "op": "core/mass-edit", "engineConfig": { "facets": [], "mode": "row-based" }, "columnName": "Title", "expression": "value", "edits": [ { "from": [ "" ], "fromBlank": true, "fromError": false, "to": "Mr" } ], "description": "Mass edit cells in column Title" }
]

# At this point, I saved a copy of the data set so that I would have an updated version of All Married Men with the correct column configuration.
#
# Then, I applied a few more transformations to create the new data set All Married Women.

[
  { "op": "core/column-addition", "engineConfig": { "facets": [], "mode": "row-based" }, "baseColumnName": "First and Middle Name", "expression": "grel:value", "onError": "set-to-blank", "newColumnName": "First and Middle Name 1", "columnInsertIndex": 4, "description": "Create column First and Middle Name 1 at index 4 based on column First and Middle Name using expression grel:value" },
  { "op": "core/column-removal", "columnName": "First and Middle Name", "description": "Remove column First and Middle Name" },
  { "op": "core/column-rename", "oldColumnName": "Spouse", "newColumnName": "First and Middle Name", "description": "Rename column Spouse to First and Middle Name" },
  { "op": "core/column-rename", "oldColumnName": "First and Middle Name 1", "newColumnName": "Spouse", "description": "Rename column First and Middle Name 1 to Spouse" },
  { "op": "core/mass-edit", "engineConfig": { "facets": [], "mode": "row-based" }, "columnName": "Title", "expression": "value", "edits": [ { "from": [ "Mr" ], "fromBlank": false, "fromError": false, "to": "Mrs" } ], "description": "Mass edit cells in column Title" },
  { "op": "core/mass-edit", "engineConfig": { "facets": [], "mode": "row-based" }, "columnName": "Title", "expression": "value", "edits": [ { "from": [ "Rev" ], "fromBlank": false, "fromError": false, "to": "Mrs" } ], "description": "Mass edit cells in column Title" },
  { "op": "core/column-move", "columnName": "First and Middle Name", "index": 5, "description": "Move column First and Middle Name to position 5" },
  { "op": "core/column-move", "columnName": "First and Middle Name", "index": 4, "description": "Move column First and Middle Name to position 4" },
  { "op": "core/column-move", "columnName": "First and Middle Name", "index": 3, "description": "Move column First and Middle Name to position 3" },
  { "op": "core/column-move", "columnName": "Spouse", "index": 5, "description": "Move column Spouse to position 5" },
  { "op": "core/column-move", "columnName": "Spouse", "index": 6, "description": "Move column Spouse to position 6" },
  { "op": "core/column-addition", "engineConfig": { "facets": [], "mode": "row-based" }, "baseColumnName": "Job", "expression": "grel:null", "onError": "set-to-blank", "newColumnName": "Job 1", "columnInsertIndex": 9, "description": "Create column Job 1 at index 9 based on column Job using expression grel:null" },
  { "op": "core/column-removal", "columnName": "Job", "description": "Remove column Job" },
  { "op": "core/column-rename", "oldColumnName": "Job 1", "newColumnName": "Job", "description": "Rename column Job 1 to Job" },
  { "op": "core/column-addition", "engineConfig": { "facets": [], "mode": "row-based" }, "baseColumnName": "Company", "expression": "grel:null", "onError": "set-to-blank", "newColumnName": "Company 1", "columnInsertIndex": 10, "description": "Create column Company 1 at index 10 based on column Company using expression grel:null" },
  { "op": "core/column-removal", "columnName": "Company", "description": "Remove column Company" },
  { "op": "core/column-rename", "oldColumnName": "Company 1", "newColumnName": "Company", "description": "Rename column Company 1 to Company" },
  { "op": "core/column-addition", "engineConfig": { "facets": [], "mode": "row-based" }, "baseColumnName": "Company Details", "expression": "grel:null", "onError": "set-to-blank", "newColumnName": "Company Details 1", "columnInsertIndex": 11, "description": "Create column Company Details 1 at index 11 based on column Company Details using expression grel:null" },
  { "op": "core/column-removal", "columnName": "Company Details", "description": "Remove column Company Details" },
  { "op": "core/column-rename", "oldColumnName": "Company Details 1", "newColumnName": "Company Details", "description": "Rename column Company Details 1 to Company Details" }
]

# I exported this data set as All Married Women. But there was one wrinkle: The few women who were listed under their own names *and* under their husbands' names now had duplicate entries. In the interest of preserving as much data as possible for each of them, I decided to manually dedupe the data in Excel. I color-coded the 205 Title = "Mrs" rows from the original data set and went down the list, combining rows as needed, incorporating several dozen duplicate entries.
#
# This process meant looking more closely at the data, and in doing so, I noticed a pattern: All of the duplicate entries were for white women, specifically. Could there really be no married Black women listed under their own names in the directory?
#
# In a later round of deduping, I would learn that a total of six married Black women did in fact have their own entries. But there was no way to discern that from the data set at this point, because none of those six women were listed as "Mrs." In fact, apart from a few presumably-masculine Reverends and the married men listed with their wives, **there were no overt markers of gender or gendered social status for any of the Black citizens in the directory.**

# In[2]:

get_ipython().run_cell_magic('HTML', '', "

Lack of Data on Black Women's Marital Status in the Charlotte 1911 Directory

\n")

# ## Unmarried People and Widows

# When I split the original data set into two, I thought the subset with unmarried people and widows would be fairly straightforward to work with. I had planned to apply the same column transformations, populate widowhood status into the title column, ascertain which "wid" entries were widows vs. widowers, and then replace remaining nulls in the Title column with Mr.
#
# Needless to say, the archival silence of the absent "Miss" in entries for Black women necessitated a change in plans. The original strategy would have categorized only white women as Miss, erroneously labeling all unmarried Black women in the data set as Mr.
#
# So instead, I took the subset of this data set for which Black = TRUE and Widow = FALSE, and I went through it line by line. I added "Miss" in the Title column for every entry with a feminine name.
#
# As far as I can discern, there are 1,564 unmarried Black women represented in the Charlotte 1911 directory. But of course, that number isn't certain. Names are an imperfect indicator of gender. Willie, for example, is the first name of 34 white people listed in the directory: 33 women and one man. Are the 10 unmarried Black registrants named Willie all women? I assumed so for the purpose of coding the data, but there's simply no way to be sure. Context can help to fill in the gaps created by this archival silence, but not without assumptions and the risk of error.
#
# The graph below, created from the complete data set that was the end result of this project, reflects the educated guesses I made in coding this part of the data. Seven people have the title "Unknown" because I wasn't able to make a reasonably confident guess of their gender based on their name.

# In[1]:

get_ipython().run_cell_magic('HTML', '', "

Charlotte 1911 Directory Data by Title and Race

\n")

# Having manually filled in the missing "Miss" designations, I returned to my original plan for this subset of the data. I opened the All Unmarried and Widowed data set in OpenRefine and applied the same column transformations I had used for the All Married Men data set, so that this data set would have the same columns in the same order and could be recombined with the others later on. Then I faceted out the entries with "wid" and populated the Title column with "Widow."

[
  { "op": "core/column-split", "engineConfig": { "facets": [], "mode": "row-based" }, "columnName": "Name", "guessCellType": true, "removeOriginalColumn": true, "mode": "separator", "separator": " ", "regex": false, "maxColumns": 2, "description": "Split column Name by separator" },
  { "op": "core/column-rename", "oldColumnName": "Name 1", "newColumnName": "Last Name", "description": "Rename column Name 1 to Last Name" },
  { "op": "core/column-rename", "oldColumnName": "Name 2", "newColumnName": "First and Middle Name", "description": "Rename column Name 2 to First and Middle Name" },
  { "op": "core/fill-down", "engineConfig": { "facets": [ { "type": "list", "name": "Orig-copy", "expression": "grel:value.contains(\"wid \")", "columnName": "Orig-copy", "invert": false, "omitBlank": false, "omitError": false, "selection": [ { "v": { "v": true, "l": "true" } } ], "selectBlank": false, "selectError": false } ], "mode": "row-based" }, "columnName": "Title", "description": "Fill down cells in column Title" },
  { "op": "core/mass-edit", "engineConfig": { "facets": [ { "type": "list", "name": "Orig-copy", "expression": "grel:value.contains(\"wid \")", "columnName": "Orig-copy", "invert": false, "omitBlank": false, "omitError": false, "selection": [ { "v": { "v": true, "l": "true" } } ], "selectBlank": false, "selectError": false } ], "mode": "row-based" }, "columnName": "Title", "expression": "value", "edits": [ { "from": [ "" ], "fromBlank": true, "fromError": false, "to": "Widow" } ], "description": "Mass edit cells in column Title" },
  { "op": "core/text-transform", "engineConfig": { "facets": [], "mode": "row-based" }, "columnName": "Title", "expression": "join ([coalesce(cells['Title'].value,''),coalesce(cells['Married'].value,'')],'')", "onError": "keep-original", "repeat": false, "repeatCount": 10, "description": "Text transform on cells in column Title using expression join ([coalesce(cells['Title'].value,''),coalesce(cells['Married'].value,'')],'')" },
  { "op": "core/column-removal", "columnName": "Married", "description": "Remove column Married" },
  { "op": "core/mass-edit", "engineConfig": { "facets": [ { "type": "list", "name": "Orig-copy", "expression": "grel:value.contains(\"wid \")", "columnName": "Orig-copy", "invert": false, "omitBlank": false, "omitError": false, "selection": [ { "v": { "v": false, "l": "false" } } ], "selectBlank": false, "selectError": false } ]

# Looking at the Widow facet, I noticed two patterns. First, as with Miss and Mrs, the designation "wid" only appeared in entries for white citizens.

# In[3]:

get_ipython().run_cell_magic('HTML', '', "

Another Data Limitation: Widowhood by Race

\n")

# The second pattern was somewhat more surprising to me, though perhaps it shouldn't have been. I initially assumed that some of the entries with the "wid" designation would be widowers, so I went through the names with the idea of updating "Widow" to "Widower" in the Title column wherever applicable. However, I found that all of the "wid" entries appeared to be for women - judging once again by the imperfect indicator of gendered names.
#
# This archival silence is perhaps less egregious and less impactful on the data than the others we've discussed, but it is nevertheless very revealing about the social structure of Charlotte in 1911. Even in death, white men conferred a certain social standing on their wives. The fact that only white men are memorialized in the directory in this way speaks to the racial and gendered hierarchies of the society that produced this primary source.

# In[4]:

get_ipython().run_cell_magic('HTML', '', "

\n")

# ## The Final, But Not Necessarily Complete, Data Set

# Having populated the Title column for every entry in the All Unmarried and Widowed data set, I had three subsets ready to be recombined into one large data set. I made a few more additions to each in OpenRefine - a column for Gender, a column for Housing Type, and a column indicating whether business address and home address were the same - and then brought it all back together.
#
# [FullFinalDataSet.xlsx](./FullFinalDataSet.xlsx)
#
# With the business directory rows eliminated and rows for married women added, my final data set describes 20,554 people. As such, it is a more complete picture of the population of 1911 Charlotte than the original data set was. However, it reifies assumptions that may not accurately represent the facts, and even with the guesses filled in, it undoubtedly still has gaps. Why do only some of the women listed under their own names as Mrs. have corresponding entries for their husbands? How did the original compilers of the directory gather data? Who might they have accidentally or deliberately omitted, and why?
#
# The archival silences present in this primary source give information, by their very existence, about the racial and gendered hierarchies that pervaded the society described by this source. They also highlight the importance of triangulating multiple data sets and approaching *any* historical data analysis with a firm grounding in the data's historical context.
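# The recombination step described in this section can be sketched in plain Python. This is a rough illustration under stated assumptions, not the OpenRefine steps actually used: the Title-to-Gender mapping, the `Same Address` column name, and the exact-match address comparison are all hypothetical, and the sample rows are invented. Housing Type is left out because the text does not say how it was derived.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]

# Assumed mapping from Title to Gender; the exact rule used is not stated in the text.
TITLE_TO_GENDER = {
    "Mr": "Male",
    "Rev": "Male",      # assumption: Reverends coded as men
    "Mrs": "Female",
    "Miss": "Female",
    "Widow": "Female",
}


def add_derived_columns(row: Row) -> Dict[str, object]:
    """Return a copy of the row with Gender and Same Address columns added."""
    out: Dict[str, object] = dict(row)
    # Any title outside the mapping (including "Unknown") stays Unknown.
    out["Gender"] = TITLE_TO_GENDER.get(row.get("Title") or "", "Unknown")
    # Simplification: treat the two addresses as equal only on an exact,
    # case-insensitive match. Real entries may also carry phone numbers.
    home = (row.get("Housing") or "").strip().lower()
    work = (row.get("Company Details") or "").strip().lower()
    out["Same Address"] = bool(home) and home == work
    return out


def recombine(*subsets: List[Row]) -> List[Dict[str, object]]:
    """Concatenate the subsets in order, adding the derived columns to each row."""
    return [add_derived_columns(r) for subset in subsets for r in subset]


# Invented sample rows, one per subset.
married_men = [{"Title": "Mr", "Housing": "12 Example St", "Company Details": "12 Example St"}]
married_women = [{"Title": "Mrs", "Housing": "12 Example St", "Company Details": None}]
unmarried_and_widowed = [{"Title": "Unknown", "Housing": "34 Example Ave", "Company Details": "56 Example Rd"}]

full = recombine(married_men, married_women, unmarried_and_widowed)
for person in full:
    print(person["Title"], person["Gender"], person["Same Address"])
```

# Defaulting to "Unknown" rather than to "Male" keeps the sketch from repeating the problem described earlier, where blank titles would have been filled in as Mr.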