Data Cleaning and Preparation

After obtaining the data, we first need to clean it to prepare it for analysis. This may include reconciling text values that are very similar but not identical, and making sure that each column of interest contains a single attribute.

This notebook should first highlight each data cleaning issue by showing examples found in the source data. Then it should show and explain the computational steps for resolving each cleaning issue.
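As a rough illustration of that pattern (show the issue, then resolve it), here is a minimal sketch. The `sample` frame, its values, and the `Location` and `State_Code` columns are invented for this example and are not taken from the source data; adapt the column names to your own file.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
sample = pd.DataFrame({
    "State": ["New York", "new york ", "NEW YORK", "Ohio"],
    "Location": ["Albany, NY", "Buffalo, NY", "Rochester, NY", "Columbus, OH"],
})

# Step 1: highlight the issue. Counting distinct values exposes
# spellings that differ only in letter case or stray whitespace.
print(sample["State"].value_counts())

# Step 2: resolve it by trimming whitespace and unifying letter case.
sample["State"] = sample["State"].str.strip().str.title()

# A column that holds two attributes ("City, State code") is split
# so that each resulting column contains a single attribute.
sample[["City", "State_Code"]] = sample["Location"].str.split(", ", expand=True)

print(sample[["State", "City", "State_Code"]])
```

Trimming and case normalization only catch the simplest near-duplicates; misspellings or abbreviations would need an explicit mapping, for example with `Series.replace`.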

In [ ]:

# Loading the Data
import pandas as pd

# Read the CSV file into a Pandas data frame:
df = pd.read_csv("mydata.csv")

# Show the first three rows
df.head(n=3)

After you run the cell above, you will see the first three rows printed out. Discuss all the columns, their data types and ranges:

column 1 - Form, string
column 2 - State, string
column 3 - City, string
column 4 - Security_Grade, string (single character)
column 5 - Area_Number, int
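One possible way to back up the column discussion with output is sketched below. It uses a small stand-in frame with the same column names as the list above; the values are invented, and in the real notebook you would run the same calls on `df` instead of `sample`.

```python
import pandas as pd

# Hypothetical stand-in for the real file; all values are invented.
sample = pd.DataFrame({
    "Form": ["A", "B", "A"],
    "State": ["NY", "OH", "NY"],
    "City": ["Albany", "Columbus", "Buffalo"],
    "Security_Grade": ["A", "C", "B"],
    "Area_Number": [12, 7, 30],
})

# Data type of each column as pandas inferred it
# (how string columns are labeled depends on the pandas version).
print(sample.dtypes)

# Range of the numeric column.
area_min = sample["Area_Number"].min()
area_max = sample["Area_Number"].max()
print(area_min, area_max)

# Distinct values of a categorical text column.
grades = sorted(sample["Security_Grade"].unique())
print(grades)
```

For a quick overview of every column at once, `sample.describe(include="all")` reports counts, distinct values, and numeric minima and maxima in a single table.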

In [ ]: