The data I used was a variation of the consolidated 1911 Charlotte dataset developed by previous INST 742 students Cliff Morris and Matt LaRoche. They wrote a series of scripts that generated latitude and longitude coordinates for every valid address in the data and plotted them on a map, resulting in a dataset with over 5,000 entries ready to be used in a GIS program. While the coding is beyond my capabilities, I highly recommend looking at their notebook here.
Before loading the CSV file into QGIS, I performed some data cleanup in OpenRefine.
I filtered out the entries that could not be georeferenced by creating a custom facet that checked whether the latitude value was 0.0, which indicates that coordinates could not be generated for that address. QGIS ignores rows that don't have coordinates, but I figured it would be easier to just remove the rows I wasn't going to use.
From the screenshot below, you can see that 5,642 addresses had coordinates, whereas 10,068 did not. Because I'm only working with about a third (roughly 36%) of the entire dataset, any conclusions drawn should be taken with a grain of salt.
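For readers who prefer a script to OpenRefine, the same filter can be sketched in a few lines of Python. This is only an illustration, not the workflow I actually used: the sample rows and the column names (address, latitude, longitude) are made up, and the real headers in the dataset may differ.

```python
import csv
import io

# Hypothetical sample standing in for the exported CSV. The column names
# are assumptions for illustration, not the dataset's real headers.
SAMPLE = """address,latitude,longitude
1 Example St,35.2271,-80.8431
2 Example St,0.0,0.0
3 Example St,35.2205,-80.8400
"""


def has_coordinates(row):
    """Return True when the row's latitude is a usable, non-zero number."""
    try:
        return float(row["latitude"]) != 0.0
    except (KeyError, TypeError, ValueError):
        # A missing or non-numeric latitude is treated as "no coordinates".
        return False


def split_by_coordinates(rows):
    """Separate rows into (geocoded, missing) lists."""
    geocoded, missing = [], []
    for row in rows:
        (geocoded if has_coordinates(row) else missing).append(row)
    return geocoded, missing


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
geocoded, missing = split_by_coordinates(rows)
print(len(geocoded), "rows kept,", len(missing), "rows dropped")

# Share of the full dataset that has coordinates, using the counts above.
share = 5642 / (5642 + 10068)
print(f"{share:.1%} of entries have coordinates")
```

The last two lines simply redo the arithmetic on the counts reported above, which is a quick way to check how much of the dataset survives the filter.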
Then, I created facets to filter by race and whether the entry was an individual or a business.
Ultimately, I ended up with four smaller datasets alongside the original, all of which I uploaded into QGIS:
I ended up not using the business data, but I wanted to include it here regardless.
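The faceting step could also be approximated in Python by grouping rows on the two attributes. Again, this is a rough sketch under assumptions: the column names (race, entry_type) and the placeholder values in the sample are hypothetical, and the grouping simply uses whatever values appear in the data rather than assuming specific categories.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample; column names and values are placeholders only.
SAMPLE = """name,race,entry_type,latitude,longitude
A,group_a,individual,35.22,-80.84
B,group_b,individual,35.23,-80.85
C,group_a,business,35.21,-80.83
D,group_b,business,35.24,-80.82
E,group_a,individual,35.25,-80.81
"""


def split_by_facets(rows, keys):
    """Group rows by the combination of values found in the given columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)
    return dict(groups)


def to_csv_text(rows):
    """Serialize one subset back to CSV text, ready to save for a GIS program."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
subsets = split_by_facets(rows, ["race", "entry_type"])
for key, subset in sorted(subsets.items()):
    print(key, len(subset))
```

With two values in each column, this produces four subsets, each of which could be written to its own CSV file with to_csv_text and loaded as a separate layer.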
This is the HOLC redlining map downloaded from the Mapping Inequality website. They also provide shapefiles for each of the areas shown on this map, which were very useful when visualizing everything.