Import the English Language Model¶

If you have not already done so, you will need to run this code to download the language model.

In [5]:

import sys
!{sys.executable} -m spacy download en_core_web_sm

Requirement already satisfied: en_core_web_sm==2.2.5 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.5/en_core_web_sm-2.2.5.tar.gz#egg=en_core_web_sm==2.2.5 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (2.2.5)
Requirement already satisfied: spacy>=2.2.2 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from en_core_web_sm==2.2.5) (2.2.4)
Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (4.44.1)
Requirement already satisfied: setuptools in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (46.1.3)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (2.23.0)
Requirement already satisfied: thinc==7.4.0 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (7.4.0)
Requirement already satisfied: blis<0.5.0,>=0.4.0 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (0.4.1)
Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (0.6.0)
Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (1.0.0)
Requirement already satisfied: plac<1.2.0,>=0.9.6 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (1.1.3)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (1.0.2)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (3.0.2)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (2.0.3)
Requirement already satisfied: srsly<1.1.0,>=1.0.2 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (1.0.2)
Requirement already satisfied: numpy>=1.15.0 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from spacy>=2.2.2->en_core_web_sm==2.2.5) (1.18.2)
Requirement already satisfied: idna<3,>=2.5 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=2.2.2->en_core_web_sm==2.2.5) (2.9)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=2.2.2->en_core_web_sm==2.2.5) (1.25.8)
Requirement already satisfied: certifi>=2017.4.17 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=2.2.2->en_core_web_sm==2.2.5) (2019.11.28)
Requirement already satisfied: chardet<4,>=3.0.2 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spacy>=2.2.2->en_core_web_sm==2.2.5) (3.0.4)
Requirement already satisfied: importlib-metadata>=0.20; python_version < "3.8" in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from catalogue<1.1.0,>=0.0.7->spacy>=2.2.2->en_core_web_sm==2.2.5) (1.6.0)
Requirement already satisfied: zipp>=0.5 in /home/jansen/.local/share/virtualenvs/bdarchives-nlp-worlW0cl/lib/python3.6/site-packages (from importlib-metadata>=0.20; python_version < "3.8"->catalogue<1.1.0,>=0.0.7->spacy>=2.2.2->en_core_web_sm==2.2.5) (3.1.0)
✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_sm')

Defining variables¶

In [1]:

## define directory path and entity type
import os
cwd = os.getcwd()
data_loc = cwd + "/data"
output_loc = cwd + "/output/"
ent_type = "PERSON"

### entity type can be "PERSON", "NORP", "ORG", "GPE", etc.
### https://spacy.io/api/annotation#named-entities

Imports and setup¶

In [2]:

import spacy
from spacy import displacy
import os
import string
import codecs
import subprocess
from collections import Counter

nlp = spacy.load('en_core_web_sm')

Walk the directory and collect text files¶

In [3]:

allfiles = []

for root, dirs, files in os.walk(data_loc):
    for file in files:
        if file.endswith(".txt"):
            allfiles.append(os.path.join(root, file))
            
print('files: %d ' % len(allfiles))

files: 4

In [4]:

myfile = codecs.open(allfiles[0], 'r', encoding='utf-8')
pagetext=myfile.read()
myfile.close()

First pass: Parse the text and recognize entities¶

Here we apply the plain, "out of the box" Spacy English model to our text document. We then display the first sentence as a dependency graph and the entire document with highlighted entities.

In [5]:

def parse():
    doc = nlp(pagetext)
    sentence_spans = list(doc.sents)
    displacy.render(sentence_spans[0:1], options={'compact': True}, style="dep")
    displacy.render(doc, options={'compact': True}, style="ent")
    

In [6]:

parse()

White Adams Henry L Fannie S route agent Southern Railway Co House 327 CARDINAL n Tryon
White Adams James Gertrude manager House ORG 419 Elizabeth PERSON av
Black Adams Jane EVENT teacher House ORG 1021 s Church ORG Black Adams John laborer House ORG 1031 s Church ORG
White Adams John J president Adams G & P Co and Char Pepsi-Cola Co House 309 CARDINAL e 6th ORDINAL
White Adams John W Cora conductor S A L Railway House 21st nr Caldwell ORG Black Adams Joseph Violet cooper House ORG 1011 s Church ORG White Adams Rev Joseph PERSON Q Leslie House 1509 s Boulevard
White Adams Miss Julia M House ORG 707 n Church ORG Black Adams Kate PERSON laundress House Groveton
White Adams Lafayette N clerk Southern Railway House 327 n Tryon
White Adams Lawrence A salesman B S Moore & Co Rooms 405 s Tryon
White Adams Laurie A Margaret N mill head House ORG ElizabetMills
Black Adams EVENT Leland porter J P Stowe & Co House 403 s Myers
Black Adams Lizzie House ORG 309 1/2 w Morehead
White Adams ORG Miss Lula clerk Belk Bros Rooms Y W C A
White Adams wid ( Geo O PRODUCT ) Grace M House ORG 601 CARDINAL n College ORG
White Adams Luther M Mamie House Groveton
Black Adams EVENT Major waiter Buford Hotel
Black Adams Mattie House ORG 719 n Graham PERSON ext
White Adams EVENT Miss Pattie V stenographer Boards 708 EVENT n Caldwell ORG Black Adams Reuben Belle PERSON laborer Y & B Co House 419 CARDINAL w 2d
Black Adams Rosa House 714 s Caldwell ORG Black Adams Rufus laborer House ORG Greenville
Black Adams EVENT Rufus driver Stand I & F Co House Ross Town
White Adams EVENT Miss Salie H assistant Carnegie Library House 707 CARDINAL n Church ORG Black Adams Violet servant 508 CARDINAL w Trade
White Adams Wheeler F Mamie moulder House ORG 303 s Cedar PRODUCT
White Adams William E The Chronicle Rooms ORG 300 1/2 CARDINAL s Church ORG
White Adcock John F mill worker House ORG 916 Calvine av
White WORK_OF_ART Adcock wid ( Jas M PERSON ) Millie M House ORG 916 Calvine av
White Adelsheimer Henry S Lizzie mill worker House ORG 1216 Louise PERSON av
Black Adkins King WORK_OF_ART laborer 600 CARDINAL s Myers
White DATE Adkins Walter D PERSON ( Leona E PERSON ) Walter D Leona E lineman House ORG (r) 305 e 13th TIME
Black Agers WORK_OF_ART Nancy cook House ORG 206 Wilson
Black Agers WORK_OF_ART Sallie laundress House ORG 420 Jackson
White Ahaus Herman Frances PERSON E tailor 203 w 4th ORDINAL House 204 CARDINAL s Church ORG White Aikel Joseph confectioner 317 e Trade Rooms 225 CARDINAL w Trade
White Aiken George W M Barbara PERSON superintendent Queen City M & G Wks House 1120 s Caldwell ORG White Aiken Henry Rooms 9 CARDINAL e 3d CARDINAL
Black Aiken Walter Ella PERSON laborer House ORG 600 CARDINAL e 2d CARDINAL

Student Exercise¶

Analyze the results obtained above. How accurate are the entities that are recognized. Can you point out any reasons why certain mistakes were made by the "out of the box" model?

Create Line-by-Line Sentence Boundaries¶

Our directory text files contain one group of related words per line, but they aren't exactly sentences. Let's see if we can improve the NLP output by telling the pipeline that each line is a sentence of related words. The code below creates a function 'set_newline_sentences', which is added to our NLP pipeline.

Newline and Escape Characters¶

The newline character in text-encoded files that is only indirectly visible. It causes the character after it to jump to the next when the file is printed or displayed in an editor or viewer. In programming languages you often need to create a newline character within a string, without typing a literal line-break. Instead we use an "escape code" to add the invisible character. Newline's escape code is '\n'. String escape code in most programming languages start with a '', for instance a tab character is created by placing '\t' in a string.

In [7]:

def set_newline_sentences(doc):
    for token in doc[:-1]:
        if token.text == "\n":
            doc[token.i+1].is_sent_start = True
        elif doc[token.i].is_sent_start is None:
            doc[token.i].is_sent_start = False
    return doc

nlp = spacy.load('en_core_web_sm')
nlp.add_pipe(set_newline_sentences, before="parser")

In [8]:

parse()

White Adams Henry L Fannie S route agent Southern Railway Co House 327 CARDINAL n Tryon
White Adams James Gertrude manager House ORG 419 Elizabeth PERSON av
Black Adams Jane EVENT teacher House ORG 1021 s Church ORG Black Adams John laborer House ORG 1031 s Church ORG
White Adams John J president Adams G & P Co and Char Pepsi-Cola Co ORG House 309 CARDINAL e 6th ORDINAL
White Adams John W Cora conductor S A L Railway House 21st nr Caldwell ORG Black Adams Joseph Violet cooper House ORG 1011 s Church ORG White Adams Rev Joseph PERSON Q Leslie House 1509 s Boulevard
White Adams Miss Julia M House ORG 707 n Church ORG Black Adams Kate PERSON laundress House Groveton
White Adams Lafayette N clerk Southern Railway House 327 n Tryon
White Adams Lawrence A salesman B S Moore & Co Rooms 405 s Tryon
White Adams Laurie A Margaret N mill head House ORG ElizabetMills
Black Adams Leland EVENT porter J P Stowe & Co House 403 s Myers
Black Adams Lizzie House ORG 309 1/2 w Morehead
White Adams ORG Miss Lula clerk Belk Bros Rooms Y W C A
White Adams wid ( Geo O PRODUCT ) Grace M House ORG 601 CARDINAL n College ORG
White Adams Luther M Mamie House Groveton
Black Adams Major EVENT waiter Buford Hotel
Black Adams Mattie House ORG 719 n Graham PERSON ext
White Adams EVENT Miss Pattie V stenographer Boards 708 EVENT n Caldwell ORG Black Adams Reuben Belle PERSON laborer Y & B Co House 419 CARDINAL w 2d
Black Adams Rosa House 714 s Caldwell ORG Black Adams Rufus laborer House ORG Greenville
Black Adams Rufus EVENT driver Stand I & F Co House Ross Town
White Adams EVENT Miss Salie H assistant Carnegie Library House 707 CARDINAL n Church ORG Black Adams Violet servant 508 CARDINAL w Trade
White Adams Wheeler F Mamie moulder House ORG 303 s Cedar PRODUCT
White Adams William E The Chronicle Rooms ORG 300 1/2 CARDINAL s Church ORG
White Adcock John F mill worker House ORG 916 Calvine av
White Adcock WORK_OF_ART wid ( Jas M PERSON ) Millie M House ORG 916 Calvine av
White Adelsheimer Henry S Lizzie mill worker House ORG 1216 Louise PERSON av
Black Adkins King laborer WORK_OF_ART 600 CARDINAL s Myers
White DATE Adkins Walter D PERSON ( Leona E PERSON ) Walter D Leona E lineman House ORG (r) 305 e 13th TIME
Black Agers WORK_OF_ART Nancy cook House ORG 206 Wilson
Black Agers Sallie WORK_OF_ART laundress House ORG 420 Jackson
White Ahaus Herman Frances PERSON E tailor 203 w 4th ORDINAL House 204 CARDINAL s Church ORG White Aikel Joseph confectioner 317 e Trade Rooms 225 CARDINAL w Trade
White Aiken George W M Barbara PERSON superintendent Queen City M & G Wks House 1120 s Caldwell ORG White Aiken Henry Rooms 9 CARDINAL e 3d CARDINAL
Black Aiken Walter Ella PERSON laborer House ORG 600 CARDINAL e 2d CARDINAL

In [13]:

from spacy.pipeline import EntityRuler
race_entities = EntityRuler(nlp)
patterns = [{"label": "RACE", "pattern": [{"LOWER": "black"},]},
            {"label": "RACE", "pattern": [{"LOWER": "white"},]}]
race_entities.add_patterns(patterns)

nlp = spacy.load('en_core_web_sm')
nlp.entity.add_label('RACE')
nlp.add_pipe(set_newline_sentences, before="parser")
nlp.add_pipe(race_entities, before="ner")

In [14]:

parse()

White RACE Adams Henry L Fannie S route agent Southern Railway Co House 327 n Tryon
White RACE Adams James PERSON Gertrude manager House ORG 419 Elizabeth PERSON av
Black RACE Adams Jane PERSON teacher House ORG 1021 s Church ORG Black RACE Adams John laborer House ORG 1031 s Church ORG
White RACE Adams John J president Adams G & P Co and Char Pepsi-Cola Co ORG House 309 CARDINAL e 6th ORDINAL
White RACE Adams John W Cora conductor S A L Railway House 21st nr Caldwell ORG Black RACE Adams Joseph PERSON Violet cooper House ORG 1011 s Church ORG White RACE Adams Rev Joseph PERSON Q Leslie House 1509 s Boulevard
White RACE Adams Miss Julia M House ORG 707 n Church ORG Black RACE Adams Kate PERSON laundress House Groveton
White RACE Adams Lafayette N clerk Southern Railway House 327 n Tryon
White RACE Adams Lawrence A salesman B S Moore & Co Rooms 405 s Tryon
White RACE Adams Laurie A Margaret N mill head House ORG ElizabetMills
Black RACE Adams Leland porter J P Stowe & Co House 403 s Myers
Black RACE Adams Lizzie House ORG 309 1/2 w Morehead
White RACE Adams Miss Lula clerk Belk Bros Rooms Y W C A
White RACE Adams wid ( Geo O PRODUCT ) Grace M House ORG 601 CARDINAL n College ORG
White RACE Adams Luther M Mamie House Groveton
Black RACE Adams Major waiter Buford Hotel
Black RACE Adams Mattie House ORG 719 n Graham PERSON ext
White RACE Adams Miss Pattie V stenographer Boards 708 EVENT n Caldwell ORG Black RACE Adams Reuben Belle PERSON laborer Y & B Co House 419 CARDINAL w 2d
Black RACE Adams Rosa House 714 s Caldwell ORG Black RACE Adams Rufus laborer House ORG Greenville
Black RACE Adams Rufus driver Stand I & F Co House Ross Town
White RACE Adams Miss Salie H assistant Carnegie Library House 707 CARDINAL n Church ORG Black RACE Adams Violet servant 508 CARDINAL w Trade
White RACE Adams Wheeler F Mamie PERSON moulder House 303 s Cedar PRODUCT
White RACE Adams William E The Chronicle Rooms WORK_OF_ART 300 1/2 CARDINAL s Church ORG
White RACE Adcock John F mill worker House ORG 916 Calvine av
White RACE Adcock wid ( Jas M PERSON ) Millie M House ORG 916 Calvine av
White RACE Adelsheimer Henry S Lizzie mill worker House ORG 1216 Louise PERSON av
Black RACE Adkins King laborer 600 CARDINAL s Myers
White RACE Adkins Walter D PERSON ( Leona E PERSON ) Walter D Leona E lineman House ORG (r) 305 e 13th TIME
Black RACE Agers Nancy cook House ORG 206 Wilson
Black RACE Agers Sallie laundress House ORG 420 Jackson
White RACE Ahaus Herman Frances PERSON E tailor 203 w 4th ORDINAL House 204 CARDINAL s Church ORG White RACE Aikel Joseph confectioner 317 e Trade Rooms 225 w Trade
White RACE Aiken George W M Barbara PERSON superintendent Queen City M & G Wks House 1120 s Caldwell ORG White RACE Aiken Henry Rooms 9 CARDINAL e 3d CARDINAL
Black RACE Aiken Walter PERSON Ella PERSON laborer House ORG 600 CARDINAL e 2d CARDINAL

In [28]:

from spacy.tokens import Span
def lastname_follows_race_entities(doc):
    new_ents = []
    for ent in doc.ents:
        new_ents.append(ent)
        if ent.label_ == "RACE":
            next_token = doc[ent.end].nbor()
            new_ent = Span(doc, next_token.i, next_token.i + 1, label="PERSON")
            new_ents.append(new_ent)
    doc.ents = new_ents
    return doc

nlp = spacy.load('en_core_web_sm')
nlp.add_pipe(set_newline_sentences, name="newline", before="parser")
nlp.entity.add_label('RACE')
nlp.add_pipe(race_entities, name="race", before="ner")
nlp.add_pipe(lastname_follows_race_entities, name="lastname", after='race')

In [29]:

parse()

White RACE Adams PERSON Henry L Fannie S route agent Southern Railway Co House 327 n Tryon
White RACE Adams PERSON James Gertrude manager House ORG 419 Elizabeth PERSON av
Black RACE Adams PERSON Jane teacher House ORG 1021 s Church ORG Black RACE Adams PERSON John laborer House ORG 1031 s Church ORG
White RACE Adams PERSON John J president Adams G & P Co and Char Pepsi-Cola Co ORG House 309 CARDINAL e 6th ORDINAL
White RACE Adams PERSON John W Cora conductor S A L Railway ORG House 21st nr Caldwell ORG Black RACE Adams PERSON Joseph Violet NORP cooper House ORG 1011 s Church ORG White RACE Adams PERSON Rev Joseph PERSON Q Leslie House 1509 s Boulevard
White RACE Adams PERSON Miss Julia M House ORG 707 n Church ORG Black RACE Adams PERSON Kate PERSON laundress House Groveton
White RACE Adams PERSON Lafayette N clerk Southern Railway House 327 n Tryon
White RACE Adams PERSON Lawrence A salesman B S Moore & Co Rooms 405 s Tryon PRODUCT
White RACE Adams PERSON Laurie A Margaret N mill head House ORG ElizabetMills
Black RACE Adams PERSON Leland porter J P Stowe & Co House 403 s Myers ORG Black RACE Adams PERSON Lizzie House 309 1/2 CARDINAL w Morehead
White RACE Adams PERSON Miss Lula clerk Belk Bros Rooms Y W C A
White RACE Adams PERSON wid ( Geo O PRODUCT ) Grace M House ORG 601 CARDINAL n College ORG
White RACE Adams PERSON Luther M Mamie House ORG Groveton
Black RACE Adams PERSON Major waiter Buford Hotel
Black RACE Adams PERSON Mattie House 719 n Graham PERSON ext
White RACE Adams PERSON Miss Pattie V stenographer Boards 708 EVENT n Caldwell ORG Black RACE Adams PERSON Reuben Belle PERSON laborer Y & B Co House 419 CARDINAL w 2d
Black RACE Adams PERSON Rosa House 714 s Caldwell ORG Black RACE Adams PERSON Rufus laborer House ORG Greenville
Black RACE Adams PERSON Rufus driver Stand I & F Co House Ross Town
White RACE Adams PERSON Miss Salie H assistant Carnegie Library House 707 n Church ORG Black RACE Adams PERSON Violet servant 508 CARDINAL w Trade
White RACE Adams PERSON Wheeler F Mamie PERSON moulder House 303 s Cedar PRODUCT
White RACE Adams PERSON William E The Chronicle Rooms ORG 300 1/2 CARDINAL s Church ORG
White RACE Adcock PERSON John F PERSON mill worker House ORG 916 Calvine av
White RACE Adcock PERSON wid (Jas M) Millie M House 916 Calvine av
White RACE Adelsheimer PERSON Henry S Lizzie mill worker House ORG 1216 Louise PERSON av
Black RACE Adkins PERSON King laborer 600 CARDINAL s Myers
White RACE Adkins PERSON Walter D (Leona E) Walter D Leona E lineman House ORG (r) 305 e 13th TIME
Black RACE Agers PERSON Nancy cook House 206 Wilson
Black RACE Agers PERSON Sallie laundress House ORG 420 Jackson
White RACE Ahaus PERSON Herman Frances PERSON E tailor 203 w 4th ORDINAL House 204 CARDINAL s Church ORG White RACE Aikel PERSON Joseph confectioner 317 e Trade Rooms 225 w Trade
White RACE Aiken PERSON George W M Barbara PERSON superintendent Queen City M & G Wks House 1120 s Caldwell ORG White RACE Aiken PERSON Henry Rooms 9 e 3d
Black RACE Aiken PERSON Walter Ella PERSON laborer House ORG 600 CARDINAL e 2d CARDINAL

Return top entities¶

In [26]:

os.makedirs(output_loc)
os.chdir(output_loc)


namecount = Counter(filter_entlist)
fullnamecount = Counter(filter_entlist2)
commonnames = [x for x in fullnamecount.most_common() if x[1] > 5]
commonall = [x for x in namecount.most_common() if x[1] > 5]

entities_table = []

for name in commonnames:
    row = [(name[0])[0].encode('utf-8'), name[1]]
    entities_table.append(row)

out_path = "entities_fullnames.csv"

header = ['Name', 'Frequency']

with open(out_path, 'w') as fo:
    csv_writer = csv.writer(fo)
    csv_writer.writerow(header)
    csv_writer.writerows(entities_table)
    
entities_table2 = []

for name in commonall:
    row = [(name[0])[0].encode('utf-8'), name[1]]
    entities_table2.append(row)

out_path = "names_all.csv"

header = ['Name', 'Frequency']

with open(out_path, 'w') as fo:
    csv_writer = csv.writer(fo)
    csv_writer.writerow(header)
    csv_writer.writerows(entities_table2)

In [ ]: