Finally, looking at events over time helps us gain some perspective on the population of the camps and the history of camp "incidents".
We will start out by importing the libraries and datasets again.
import pandas as pd
# reading in all of the data
data_Card = pd.read_csv( "Datasets/Cards_Box9.csv" )
data_Form26 = pd.read_csv("Datasets/WRAForm26.csv")
data_FAR = pd.read_csv("Datasets/TuleLake_FAR_ALL_FINAL4.csv")
We are going to use a Python library, Matplotlib, to create some bar charts. The bar() function takes two important arguments:
first a list of X-axis values, then a list of Y-axis values. Each bar is drawn from the corresponding point on the X-axis up to
the respective value in the Y-dimension. The incident card dataset makes for a basic example, since it only has a few years of data.
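As a minimal sketch of how the two arguments line up, the snippet below draws three bars from two short lists. The years and counts here are made-up placeholder values chosen for illustration, not figures read from the datasets.

```python
import matplotlib.pyplot as plt

# Illustrative values only (not taken from the datasets)
x_values = [1942, 1943, 1944]   # positions along the X-axis
y_values = [3, 7, 5]            # bar heights, one per X value

bars = plt.bar(x_values, y_values)  # the i-th bar sits at x_values[i] and rises to y_values[i]
plt.xticks(x_values)                # put a tick label under each bar
plt.show()
```

The two lists must have the same length, since each X value is paired with the Y value at the same position.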
In the code below we make a sorted list of the unique years that appear on the incident cards (unique_years).
Then we ask Pandas to count the number of cards we have for each unique year, via the .value_counts() function.
Lastly, we use a Python technique called a list comprehension to build a new list holding the number of cards per year, in the same
order as the previously sorted list of unique years. List comprehensions are powerful tools, but what they do is straightforward:
they make a new list from an old one, replacing or dropping some data along the way, as needed.
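To see the counting and reordering steps in isolation, here is a small sketch on a hand-made Series. The sample years are invented for illustration; only unique(), value_counts(), and the list comprehension mirror the steps described above.

```python
import pandas as pd

# Invented sample data standing in for a 'Year' column
years = pd.Series([1943, 1942, 1943, 1944, 1943])

unique_years = sorted(years.unique())   # distinct years in ascending order
counts = years.value_counts()           # counts per year, most frequent year first

# Look up the count for each year, following the sorted order of years
by_year = [counts[year] for year in unique_years]

print([int(y) for y in unique_years])
print([int(n) for n in by_year])
```

The list comprehension matters because value_counts() orders its result by frequency, not by year, so the counts have to be looked up again in year order before plotting.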
%matplotlib inline
import matplotlib.pyplot as plt
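unique_years = data_Card['Year'].unique() # get unique years in dataset
unique_years = sorted( unique_years ) # sort the list of years
cards_per_year = data_Card['Year'].value_counts() # get number of cards for each unique year
print(cards_per_year)
by_year = [ cards_per_year[year] for year in unique_years ] # counts per year, in sorted-year order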
plt.xlabel( 'Card Year' )
plt.ylabel( 'Incident Counts' )
plt.title( 'Number of Incident Cards / Year in Box9' )
plt.bar( unique_years, by_year )
plt.show()
1943    85
1944    14
1942    14
Name: Year, dtype: int64
%matplotlib inline
import matplotlib.pyplot as plt
dfu = data_Form26['BirthYear'].unique() # get unique years in dataset
dfus = sorted( dfu ) # sort the list of years
dfvc = data_Form26['BirthYear'].value_counts() # get number of rows for each unique year
by_year = [ dfvc[year] for year in dfus ] # generate a list of rows per year, using the sorted list from before
plt.xlabel( 'Form26 BirthDates' )
plt.ylabel( 'Form26 Birthday Counts' )
plt.title( 'Number of Form26 Birthdays / Year' )
plt.bar( dfus, by_year ) # create a bar chart of birth events per year
plt.show()
%matplotlib inline
import matplotlib.pyplot as plt
def get_year(elem):
    return elem[0]
dfu = data_FAR['BirthYear'].unique()
dfus = sorted( dfu )
dfvc = data_FAR['BirthYear'].value_counts()
by_year = [ dfvc[year] for year in dfus ]
plt.xlabel( 'FAR BirthDates' )
plt.ylabel( 'FAR Birthday Counts' )
plt.title( 'Number of FAR Birthdays / Year' )
plt.bar( dfus, by_year )
plt.show()
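The three cells above repeat the same steps, so they could be folded into one helper. The sketch below is one possible way to do that; the function name counts_by_year and the sample DataFrame are hypothetical and not part of the original notebook. It uses sort_index() on the result of value_counts() as an alternative to the list comprehension.

```python
import pandas as pd

def counts_by_year(df, column):
    """Return a Series of row counts per value in `column`, ordered by that value."""
    # value_counts() skips missing values by default; sort_index() orders the result by year
    return df[column].value_counts().sort_index()

# Hypothetical sample data for illustration only
sample = pd.DataFrame({"BirthYear": [1920, 1901, 1920, 1915, 1920]})
per_year = counts_by_year(sample, "BirthYear")

years = per_year.index.tolist()   # years in ascending order
counts = per_year.tolist()        # counts in the same order as years
print(years, counts)
```

The two resulting lists can be passed to plt.bar( years, counts ) in the same way as in the cells above.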
Questions: