The RickyRenuncia Project team carried out several acquisition procedures to preserve records of the events of summer 2019 related to the departure from office of former governor Ricardo Rosselló Nevares.
The team collected artifacts and banners used during the demonstrations. Whenever possible, the artifacts were accompanied by audio interviews and/or photographs of the demonstrators who produced and used them.
Using social media and word of mouth, the team also contacted the community to request images and other content related to the activities of that summer.
To get a broad view of the many activities and demonstrations around the globe, one of the team members, Joel Blanco, decided to capture records of tweet activity on the web. This data was captured live during the days of the events and requires processing and analysis to support a valid interpretation of the information acquired.
A cleaned version of this dataset occupies over 7 gigabytes but fits into 777 megabytes when compressed with gzip. Plain-text data generally compresses well. Below we calculate the benefit of compressing this specific dataset.
# Calculate the storage benefits of compression
# Observations
original_size_G = 7
final_size_M = 777
# Unit transformation
giga_to_mega_rate = 1024.0
original_size_M = original_size_G * giga_to_mega_rate
# Calculate percent change
new_size_to_old_size = final_size_M / original_size_M
new_size_percent = new_size_to_old_size * 100.0
space_freed_percent = 100 - new_size_percent
print(
    "The storage was reduced to {:.1f}%.\nAfter compression, {:.1f}% of the originally occupied space was freed."
    .format(new_size_percent, space_freed_percent)
)
The storage was reduced to 10.8%.
After compression, 89.2% of the originally occupied space was freed.
The savings can be substantial, especially for long-term storage.
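The arithmetic above starts from two observed sizes. As a minimal sketch of how such a ratio could be measured directly, the block below compresses a small in-memory sample with Python's standard gzip module. The records are hypothetical stand-ins for tweet data, so the ratio it prints will not match the 10.8% observed for the real dataset.

```python
import gzip
import json

# Hypothetical stand-in records; the real dataset has far richer tweet objects.
records = [
    {"id_str": str(1151268971557134337 + i), "text": "RT #RickyRenuncia " * 5, "lang": "es"}
    for i in range(1000)
]

# Serialize one JSON object per line, as in a JSONL file.
raw = "\n".join(json.dumps(r) for r in records).encode("utf-8")

# Compress in memory and compare sizes.
compressed = gzip.compress(raw)
ratio = len(compressed) / len(raw)

print(f"Original:   {len(raw)} bytes")
print(f"Compressed: {len(compressed)} bytes")
print(f"Compressed size is {ratio * 100:.1f}% of the original.")
```

Highly repetitive text such as this sample compresses far better than real tweets would, so treat the printed ratio as an illustration of the measurement, not as an estimate.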
It is important to understand the type of data collected from a social media API (application programming interface). The file Data/Joel/tweetsRickyRenuncia-final.jsonl is in JSONL format. If you are familiar with JSON files, this format is simply a sequence of JSON strings, one per line; the 'L' stands for line (jsonl = json-l = json-line).
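To make the JSONL idea concrete, here is a minimal sketch that parses a two-line JSONL string with the standard json module. The records and the helper name read_jsonl are invented for illustration; real tweets carry many more fields.

```python
import io
import json

# Hypothetical two-record JSONL content.
jsonl_text = (
    '{"id_str": "1", "text": "first tweet"}\n'
    '{"id_str": "2", "text": "second tweet"}\n'
)


def read_jsonl(handle):
    """Yield one parsed object per non-empty line of a JSONL stream."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


# io.StringIO stands in for an open file handle.
tweets = list(read_jsonl(io.StringIO(jsonl_text)))
print(len(tweets), tweets[0]["text"])
```

Because each line is an independent JSON document, a large file can be processed one record at a time without loading it all into memory.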
This dataset was collected from Twitter in 2019. The Twitter API has since been updated, but this data uses the previous API conventions. We will use Python's json library to parse a random line from the source data to help you visualize its structure. Observe that some of the content is readily available (the text field), while other content is harder to reach (the media URL).
The full list of tweet ids is available here.
Below we show how try/except blocks and while loops can be used to scan the data until a post with images is found.
import json
from random import seed, randint
import os
dir_path = os.getcwd()
print(dir_path)
#os.chdir("/home/torrien/")
#dir_path = os.getcwd()
#print(dir_path)
#print(os.listdir())
#print(os.listfile())
JL_DATA = "/home/rickyrenuncia/tweetsRickyRenuncia-final.jsonl"
# Get the SAMPLE_SIZE by counting the non-empty lines
SAMPLE_SIZE = 0
with open(JL_DATA, "r") as data_handler:
    for line in data_handler:
        if line != "\n":
            SAMPLE_SIZE += 1
print(f"Sample Size:{SAMPLE_SIZE}\n\n")
# Get a random number of lines to skip before taking a single sample
# Try seeds 1 and 16 or any you want to test
seed(1)
skip_lines = randint(0, SAMPLE_SIZE - 1)
# Reopen file using the with-open-as style and print out a single sample
with open(JL_DATA, 'r') as data_handler:
    # Use next to skip a line, the for loop allows skipping multiple lines
    for _ in range(skip_lines):
        next(data_handler)
    while True:
        # Loop until a tweet with media.
        try:
            # Capture string
            raw_data = data_handler.readline()
            # An empty string means end of file: stop instead of looping forever.
            if not raw_data:
                break
            # Verify if the json has any 'media_url_https' keys.
            if 'media_url_https' not in raw_data:
                continue
            data = json.loads(raw_data)
        except Exception:
            break
        try:
            i = 0
            while True:
                try:
                    media_url = data['retweeted_status']['entities']['media'][i]['media_url_https']
                except Exception:
                    i += 1
                    if i > 10:
                        media_url = "Could not quickly find a tweet with media."
                        raise  # Pass error to the enclosing try/except.
                    continue
                break
        except Exception:
            continue
        print("Text:", data['text'])
        # The Tweet URL is a Twitter convention where both the tweet ID and the user's screen_name are required to access the status.
        print("Tweet URL using user's screen_name:", f"https://twitter.com/{data['user']['screen_name']}/status/{data['id_str']}")
        print("Tweet URL using user's ID :", f"https://twitter.com/{data['user']['id_str']}/status/{data['id_str']}")
        print("Media:", media_url)
        # print(f"In reply to: {json.dumps(data['retweeted_status'], indent=1)}")
        print("\n")
        # The indent and sort_keys in json.dumps "prettify" the output. Still not pretty.
        # print("Raw Data:")
        # print("#"*50)
        # print(json.dumps(data, indent=4, sort_keys=True))
        # print("#"*50)
        break
/home/rickyrenuncia/RickyRenuncia-case-module_shared
Sample Size:1113758


Text: RT @vicgasco: "Somos tan buenos que cogemos a los nuestros de pendejos" -Ricardo Rosselló #RickyRenuncia #RickyRenunciaYa https://t.co/isIw…
Tweet URL using user's screen_name: https://twitter.com/frances_sola/status/1151268971557134337
Tweet URL using user's ID : https://twitter.com/3299053100/status/1151268971557134337
Media: https://pbs.twimg.com/media/D_oenfmXoAUuieN.jpg
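The nested try/except search for a media URL can also be written with dictionary lookups that tolerate missing keys. The sketch below is one possible alternative, not the approach used in this notebook. It assumes the older (v1.1-style) tweet layout, where media may appear under entities or extended_entities, either at the top level or inside retweeted_status. The helper names find_media_urls and tweet_url and the sample tweet are hypothetical.

```python
def find_media_urls(tweet):
    """Collect media_url_https values from a tweet dict (v1.1-style layout assumed)."""
    urls = []
    # A retweet usually carries its media inside 'retweeted_status'.
    candidates = [tweet, tweet.get("retweeted_status") or {}]
    for status in candidates:
        for section in ("extended_entities", "entities"):
            for media in (status.get(section) or {}).get("media", []):
                url = media.get("media_url_https")
                if url and url not in urls:
                    urls.append(url)
    return urls


def tweet_url(tweet):
    """Build the status URL from the user's screen_name and the tweet ID."""
    user = tweet["user"]["screen_name"]
    return f"https://twitter.com/{user}/status/{tweet['id_str']}"


# Hypothetical, heavily trimmed tweet used only for illustration.
sample = {
    "id_str": "100",
    "user": {"screen_name": "example_user"},
    "entities": {"hashtags": []},
    "retweeted_status": {
        "entities": {
            "media": [{"media_url_https": "https://pbs.twimg.com/media/example.jpg"}]
        }
    },
}

print(find_media_urls(sample))
print(tweet_url(sample))
```

Using dict.get with a default avoids raising exceptions for tweets that have no media, so the function simply returns an empty list in that case.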
As data analysts we need to understand the data before we can set goals.
from tweet_rehydrate.analysis import TweetJLAnalyzer, TweetAnalyzer, getsizeof
from random import randint
JL_DATA="/home/rickyrenuncia/tweetsRickyRenuncia-final.jsonl"
SAMPLE_SIZE = 1113758
data = TweetJLAnalyzer(JL_DATA, reset=True, local_media=False, cache_size=2000)
size=getsizeof(data)
print(str(size))
print(str(size/1024.0))
/home/rickyrenuncia/.multiple_sorts_tweetsRickyRenuncia-final.jsonl.pkl
Creating Lists
/home/rickyrenuncia/tweetsRickyRenuncia-final.jsonl
Finished: 200000
Finished: 400000
Finished: 600000
Finished: 800000
Finished: 1000000
Expecting value: line 1 column 1 (char 0)
Could not create tweet
Last line processed = 1113758
48
0.046875
most_retweeted_media = data.get_most_retweeted_media(40)
print("Amount found: ", len(most_retweeted_media))
for rt_count, m_id, m in most_retweeted_media[15:21]:
    print(m)
    print("*"*20 + "\n" + str(rt_count) + " - " + str(m_id) + "\n" + "*"*20 + "\n\n")
Amount found: 42
PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1151738583000264704/pu/img/M_mUM7-RiqQHIFwB.jpg
********************
112370 - 1151738583000264704
********************

PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1151475207007326208/pu/img/upRlenku-1oo84bw.jpg
********************
110941 - 1151475207007326208
********************

PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1151645615660699659/pu/img/IRu1lKdxKDEusUcg.jpg
********************
103496 - 1151645615660699659
********************

PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1153281467121831938/pu/img/E9obyW9yqFLqL4aN.jpg
********************
101538 - 1153281467121831938
********************

PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1152226566652518406/pu/img/30CitbamtRq0VWge.jpg
********************
90413 - 1152226566652518406
********************

PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1150890766107066369/pu/img/Xfaaja0nGSz22hK3.jpg
********************
84206 - 1150890766107066369
********************
most_retweeted_posts = data.get_most_retweeted(100, has_media=True)
# Save popular posts
import pickle
with open("100_most_retweeted_posts.pickle", 'wb') as handler:
    pickle.dump(most_retweeted_posts, handler)
# Recall popular posts
import pickle
with open("100_most_retweeted_posts.pickle", 'rb') as handler:
    most_retweeted_posts = pickle.load(handler)
import random
print("Amount found: ", len(most_retweeted_posts))
# Keywords that suggest the tweet is about the protests
KEYWORDS = ("renuncia", "puerto rico", "ricky", "rosell")
for rt_count, tweet_id, key in random.sample(most_retweeted_posts[11:21], 10):
    tweet = data.fetch_by_id(tweet_id)
    text = tweet.data["text"].lower()
    if any(keyword in text for keyword in KEYWORDS):
        print(tweet)
        print("*"*20 + "\n" + str(rt_count) + " - " + str(tweet_id) + " - " + str(key) + "\n" + "*"*20 + "\n\n")
    else:
        # print(tweet.data["text"])
        print(tweet)
        print("*"*10 + "\n" + str(rt_count) + " - " + str(tweet_id) + " - " + str(key) + "\n\n")
Amount found: 100
ID: 1151475227404185600
Text: Your daily dose of antidepressant https://t.co/7MXYSyobOf
URL: https://twitter.com/2896099018/status/1151475227404185600
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1151475207007326208/pu/img/upRlenku-1oo84bw.jpg', 'VIDEO: https://video.twimg.com/ext_tw_video/1151475207007326208/pu/vid/480x480/oNJBGtn57RnjFF9j.mp4?tag=10']
**********
110941 - 1151475227404185600 - 992800.0001

ID: 1150890789033193481
Text: “Whose driving?” Everyone with cars: https://t.co/dtwxbdZJ04
URL: https://twitter.com/2544501210/status/1150890789033193481
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1150890766107066369/pu/img/Xfaaja0nGSz22hK3.jpg', 'VIDEO: https://video.twimg.com/ext_tw_video/1150890766107066369/pu/vid/480x480/hUo98Vfa6UYdETtE.mp4?tag=10']
**********
84206 - 1150890789033193481 - 298732.0001

ID: 1152968581862346752
Text: A thread for my non Spanish speaking followers explaining what’s going on in Puerto Rico 🇵🇷 https://t.co/2K5Qr3GOSu
URL: https://twitter.com/1312036255/status/1152968581862346752
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/media/EAAqz9aXYAIkkbi.jpg', 'PHOTO: https://pbs.twimg.com/media/EAAqz9aXYAIkkbi.jpg']
********************
83996 - 1152968581862346752 - 1084631.0101
********************

ID: 1152279015576784896
Text: 'They killed my mom and my six brothers' Trump: 'Where are they now?' 'They're dead' https://t.co/SfKSpAXyP1
URL: https://twitter.com/68752979/status/1152279015576784896
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1152226566652518406/pu/img/30CitbamtRq0VWge.jpg', 'VIDEO: https://video.twimg.com/ext_tw_video/1152226566652518406/pu/vid/388x360/zS3fIb5QOntfBsRM.mp4?tag=10']
**********
90413 - 1152279015576784896 - 923475.0101

ID: 1150516560018018305
Text: Me pretending I didn’t see someone I know in public https://t.co/drw8A5Nsl1
URL: https://twitter.com/534931954/status/1150516560018018305
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1150516529064075265/pu/img/JtB7Z6VJDSrCqNJM.jpg', 'VIDEO: https://video.twimg.com/ext_tw_video/1150516529064075265/pu/vid/480x480/d9vgRMGh1-S6MPE4.mp4?tag=10']
**********
118018 - 1150516560018018305 - 208876.0001

ID: 1152313246285742081
Text: me trying to lose 5 kilos in 5 minutes https://t.co/KUDRKP0ar0
URL: https://twitter.com/241292536/status/1152313246285742081
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1151738583000264704/pu/img/M_mUM7-RiqQHIFwB.jpg', 'VIDEO: https://video.twimg.com/ext_tw_video/1151738583000264704/pu/vid/360x640/rcnaRAh7J8qNmSgo.mp4?tag=10']
**********
112370 - 1152313246285742081 - 865476.0001

ID: 1152025567794872320
Text: Buddy wanted $24.95 for a picture of me on the roller coaster https://t.co/SwbdLRGfby
URL: https://twitter.com/15355278/status/1152025567794872320
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1152025523633213440/pu/img/dzz33EwahtHZIV1B.jpg', 'VIDEO: https://video.twimg.com/ext_tw_video/1152025523633213440/pu/vid/360x638/g6a5mSdO5BvJOtdU.mp4?tag=10']
**********
76575 - 1152025567794872320 - 899336.0101

ID: 1151645825036148736
Text: SOMEONE ADDED BALLS TO THIS SCENE AND NOW IT'S AMAZING😂 https://t.co/NTJDbhwFFv
URL: https://twitter.com/595688846/status/1151645825036148736
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1151645615660699659/pu/img/IRu1lKdxKDEusUcg.jpg', 'VIDEO: https://video.twimg.com/ext_tw_video/1151645615660699659/pu/vid/640x360/8CtIpQFodrv4Dv-B.mp4?tag=10']
**********
103496 - 1151645825036148736 - 956056.0001

ID: 1152737403913854977
Text: Me happily listening to the same 6 songs everyday https://t.co/8goDOTSLvw
URL: https://twitter.com/1027329545064603648/status/1152737403913854977
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1152737371533787140/pu/img/8a1Vv8-Nyrwg2cSn.jpg', 'VIDEO: https://video.twimg.com/ext_tw_video/1152737371533787140/pu/vid/360x708/xZMzjSu4y_hXyDhZ.mp4?tag=10']
**********
148330 - 1152737403913854977 - 1085243.0101

ID: 1153281489976606720
Text: Next we work in a little conditioner. https://t.co/ffveNaXU72
URL: https://twitter.com/2262255041/status/1153281489976606720
Retweet:False
Original Tweet URL: Not applicable
Quotes:False
Quoted Tweet URL: Not applicable
Has Media=True
Has Local Media=True
Media=['PHOTO: https://pbs.twimg.com/ext_tw_video_thumb/1153281467121831938/pu/img/E9obyW9yqFLqL4aN.jpg', 'VIDEO: https://video.twimg.com/ext_tw_video/1153281467121831938/pu/vid/360x520/XqCo21S7WuPPbC48.mp4?tag=10']
**********
101538 - 1153281489976606720 - 542924.0101
# randint(0,SAMPLE_SIZE-6)
# print(data.head(5, 40, sep="\n" + "*"*100 + "\n\n"))
#RickyRenuncia
#RickyVeteYa
print(data.head(5, randint(0,SAMPLE_SIZE-6), sep="\n" + "*"*100 + "\n\n"))
ID: 1153081436636925952
Text: RT @CortesBob: This will only add fuel to the protesting fire. @ricardorossello has lost all trust to govern & the people do not believe in…
URL: https://twitter.com/3222769285/status/1153081436636925952
Retweet:True
Original Tweet URL: https://twitter.com/612777130/status/1153064207472123905
Quotes:False
Quoted Tweet URL: Not applicable
Media=[]
****************************************************************************************************

ID: 1153081432895631367
Text: RT @carlosdelgado21: El gobernador @ricardorossello sigue jugando a la politica. Esto no se trata del partido ni de la presidencia del mism…
URL: https://twitter.com/550143070/status/1153081432895631367
Retweet:True
Original Tweet URL: https://twitter.com/39184279/status/1153065364827385856
Quotes:False
Quoted Tweet URL: Not applicable
Media=[]
****************************************************************************************************

ID: 1153081430232195072
Text: RT @Tommy_Torres: La generación de Benito no cree en el Ay Bendito. #RickyRenuncia https://t.co/ZdZL6dRiE1
URL: https://twitter.com/2723666981/status/1153081430232195072
Retweet:True
Original Tweet URL: https://twitter.com/61359460/status/1152693083164807168
Quotes:False
Quoted Tweet URL: Not applicable
Media=['PHOTO: https://pbs.twimg.com/media/D_8wPoCXoAEq8jy.jpg', 'PHOTO: https://pbs.twimg.com/media/D_8wPoCXoAEq8jy.jpg', 'PHOTO: https://pbs.twimg.com/media/D_8wPoAWwAARYrz.jpg', 'PHOTO: https://pbs.twimg.com/media/D_8wPn_WwAEJE-y.jpg']
****************************************************************************************************

ID: 1153081425329041408
Text: RT @pjsinsuela: 👑 Mañana 7AM nos vemos en el #ParoNaciona a sacar a este bacalao' de su gobernación y mejorar nuestra situación. ¡Puerto Ri…
URL: https://twitter.com/858487597/status/1153081425329041408
Retweet:True
Original Tweet URL: https://twitter.com/222536921/status/1153076413643276289
Quotes:False
Quoted Tweet URL: Not applicable
Media=[]
****************************************************************************************************

ID: 1153081422586023936
Text: RT @Samynemir: 20 years ago, #PuertoRico mobilized to remove US Marine- world’s most powerful military- out of Vieques as it was bombarded…
URL: https://twitter.com/1459766030/status/1153081422586023936
Retweet:True
Original Tweet URL: https://twitter.com/128790234/status/1152783400127930371
Quotes:False
Quoted Tweet URL: Not applicable
Media=[]
****************************************************************************************************
print(data.head(2, sep="\n*************\n"))
<class 'AttributeError'>
("'TweetAnalyzer' object has no attribute 'media'",)
'TweetAnalyzer' object has no attribute 'media'
print(type(data.retweet_cache))
print(str(data.retweet_cache.keys())[:400])
print(str(data.retweet_cache)[:400])
<class 'dict'>
dict_keys([25, 0, 2894, 6, 2, 4, 205, 106, 1291, 23, 121, 1458, 256, 3, 274, 8, 834, 30, 1, 586, 431, 13, 816, 49, 160, 2106, 1604, 210, 107, 27, 10, 1711, 994, 396, 1520, 17, 176, 92, 93, 162, 21, 63, 48, 45, 4326, 87, 1673, 257, 923, 24, 123, 139, 163, 481, 36, 1450, 54, 932, 119, 801, 275, 22, 7, 247, 32, 213, 143, 71, 18, 383, 185, 44, 700, 301, 50, 182, 368, 242, 28, 11, 207, 219, 955, 5, 53,
{25: [{'id': 1152013350382915589, 'id_str': '1152013350382915589', 'user': {'id': 1057328811585560576, 'id_str': '1057328811585560576'}, 'jsonl_key': '1152013350382915589'}, {'id': 1152013299300438016, 'id_str': '1152013299300438016', 'user': {'id': 906410164185649152, 'id_str': '906410164185649152'}, 'jsonl_key': '1152013299300438016'}, {'id': 1152012777365446656, 'id_str': '1152012777365446656',
print(data.retweet_cache[0][0])
print(str(data.quoteOf)[:400])
print(str(data.retweetOf)[:400])
print(str(data.retweet_cache)[:400])
retweet_counts = list(data.retweet_cache.keys())
retweet_counts.sort(reverse=True)
quote_counts = list(data.quote_cache.keys())
quote_counts.sort(reverse=True)
print(str(retweet_counts)[:400])
print(str(quote_counts)[:400])
{'id': 1152013349267234817, 'id_str': '1152013349267234817', 'user': {'id': 474765264, 'id_str': '474765264'}, 'jsonl_key': '1152013349267234817'}
{'https://twitter.com/residente/status/1151965929925959680': 49, 'https://twitter.com/i/web/status/1152013334717198336': 1, 'https://twitter.com/perlalessandra/status/1152001977355603968': 1112, 'https://twitter.com/petebuttigieg/status/1151993436536393729': 17, 'https://twitter.com/musiccapos/status/1152005696260464641': 56, 'not applicable': 50119, 'https://twitter.com/idislikegabo/status/115200
{'https://twitter.com/1232457985/status/1151967163646926855': 39, 'https://twitter.com/1311098821/status/1151689632955928577': 2268, 'https://twitter.com/1052162276399243264/status/1151965615080521728': 6, 'https://twitter.com/741592836/status/1152012539082858496': 1, 'https://twitter.com/2523919437/status/1152012188279701504': 33, 'https://twitter.com/1915636538/status/1152008881909858305': 1169,
{25: [{'id': 1152013350382915589, 'id_str': '1152013350382915589', 'user': {'id': 1057328811585560576, 'id_str': '1057328811585560576'}, 'jsonl_key': '1152013350382915589'}, {'id': 1152013299300438016, 'id_str': '1152013299300438016', 'user': {'id': 906410164185649152, 'id_str': '906410164185649152'}, 'jsonl_key': '1152013299300438016'}, {'id': 1152012777365446656, 'id_str': '1152012777365446656',
[20982, 20981, 20980, 20978, 20977, 20976, 20935, 20933, 20932, 20931, 20930, 20929, 20928, 20927, 20925, 20924, 20923, 20922, 20921, 20920, 20918, 20917, 8010, 8009, 8001, 6865, 6864, 6855, 6854, 6853, 6852, 6851, 6850, 6849, 6182, 6181, 6180, 6179, 6178, 6177, 6173, 6172, 6171, 6170, 6169, 5717, 5716, 5715, 5656, 5655, 5654, 5653, 5652, 5651, 5650, 5649, 5547, 5546, 5545, 5544, 5504, 5502, 5501,
[]
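The retweetOf output above maps the URL of an original tweet to the number of times it was retweeted within the dataset. As a rough sketch of how such a tally could be built from raw JSONL lines (an assumption for illustration, not the actual tweet_rehydrate implementation), the block below uses collections.Counter on three invented records.

```python
import json
from collections import Counter

# Hypothetical trimmed records: two retweets of the same original, one plain tweet.
jsonl_lines = [
    '{"id_str": "10", "user": {"id_str": "1"}, "retweeted_status": {"id_str": "5", "user": {"id_str": "9"}}}',
    '{"id_str": "11", "user": {"id_str": "2"}, "retweeted_status": {"id_str": "5", "user": {"id_str": "9"}}}',
    '{"id_str": "12", "user": {"id_str": "3"}}',
]

retweet_of = Counter()
for line in jsonl_lines:
    tweet = json.loads(line)
    original = tweet.get("retweeted_status")
    if original is None:
        # Not a retweet: count it under a placeholder key.
        retweet_of["not applicable"] += 1
        continue
    # Identify the original tweet by a URL built from the user ID and tweet ID.
    url = f"https://twitter.com/{original['user']['id_str']}/status/{original['id_str']}"
    retweet_of[url] += 1

# Most frequently retweeted originals first.
for url, n in retweet_of.most_common():
    print(n, url)
```

Note that this counts retweets captured in the dataset, which can differ from the retweet_count field that Twitter reports for each tweet.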
import json
sample_t = data.fetch_by_position(112)
print(json.dumps(sample_t.data, indent=4))
{
"contributors": null,
"truncated": false,
"text": "RT @kellydiazr: AHORA: \n\ud83d\udea8\ud83d\udea8\ud83d\udea8A la corilla de La Fiebre que sal\u00eda hoy hacia Fortaleza les hicieron un bloqueo ilegal, los hartaron a tickets y\u2026",
"is_quote_status": false,
"in_reply_to_status_id": null,
"id": 1152013081255407618,
"favorite_count": 0,
"source": "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
"retweeted": false,
"coordinates": null,
"entities": {
"symbols": [],
"user_mentions": [
{
"indices": [
3,
14
],
"id_str": "1915636538",
"screen_name": "kellydiazr",
"name": "K. D\u00edaz",
"id": 1915636538
}
],
"hashtags": [],
"urls": []
},
"in_reply_to_screen_name": null,
"in_reply_to_user_id": null,
"retweet_count": 205,
"id_str": "1152013081255407618",
"favorited": false,
"retweeted_status": {
"contributors": null,
"truncated": true,
"text": "AHORA: \n\ud83d\udea8\ud83d\udea8\ud83d\udea8A la corilla de La Fiebre que sal\u00eda hoy hacia Fortaleza les hicieron un bloqueo ilegal, los hartaron a t\u2026 https://t.co/47bAazekHL",
"is_quote_status": false,
"in_reply_to_status_id": null,
"id": 1152008881909858305,
"favorite_count": 99,
"source": "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
"retweeted": false,
"coordinates": null,
"entities": {
"symbols": [],
"user_mentions": [],
"hashtags": [],
"urls": [
{
"url": "https://t.co/47bAazekHL",
"indices": [
117,
140
],
"expanded_url": "https://twitter.com/i/web/status/1152008881909858305",
"display_url": "twitter.com/i/web/status/1\u2026"
}
]
},
"in_reply_to_screen_name": null,
"in_reply_to_user_id": null,
"retweet_count": 205,
"id_str": "1152008881909858305",
"favorited": false,
"user": {
"follow_request_sent": false,
"has_extended_profile": true,
"profile_use_background_image": false,
"contributors_enabled": false,
"id": 1915636538,
"verified": false,
"translator_type": "regular",
"profile_text_color": "000000",
"profile_image_url_https": "https://pbs.twimg.com/profile_images/1143401892854321153/sjw9lvDu_normal.jpg",
"profile_sidebar_fill_color": "000000",
"entities": {
"url": {
"urls": [
{
"url": "https://t.co/ke4r46jzGu",
"indices": [
0,
23
],
"expanded_url": "http://kellydiazrodriguez.wordpress.com",
"display_url": "kellydiazrodriguez.wordpress.com"
}
]
},
"description": {
"urls": []
}
},
"followers_count": 3372,
"profile_sidebar_border_color": "000000",
"id_str": "1915636538",
"default_profile_image": false,
"listed_count": 56,
"is_translation_enabled": false,
"utc_offset": null,
"statuses_count": 69543,
"description": "Escribo cosas, brego con libros y como nada es perfecto: no s\u00e9 bailar salsa. | #RickyRenuncia",
"friends_count": 301,
"location": "Sanqueerce, Puerto Rico ",
"profile_link_color": "000456",
"profile_image_url": "http://pbs.twimg.com/profile_images/1143401892854321153/sjw9lvDu_normal.jpg",
"notifications": false,
"geo_enabled": true,
"profile_background_color": "000000",
"profile_banner_url": "https://pbs.twimg.com/profile_banners/1915636538/1462228528",
"profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
"screen_name": "kellydiazr",
"lang": null,
"following": false,
"profile_background_tile": false,
"favourites_count": 21839,
"name": "K. D\u00edaz",
"url": "https://t.co/ke4r46jzGu",
"created_at": "Sat Sep 28 23:39:07 +0000 2013",
"profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
"time_zone": null,
"protected": false,
"default_profile": false,
"is_translator": false
},
"geo": null,
"in_reply_to_user_id_str": null,
"lang": "es",
"created_at": "Fri Jul 19 00:14:55 +0000 2019",
"in_reply_to_status_id_str": null,
"place": null,
"metadata": {
"iso_language_code": "es",
"result_type": "recent"
}
},
"user": {
"follow_request_sent": false,
"has_extended_profile": true,
"profile_use_background_image": true,
"contributors_enabled": false,
"id": 1141124347534462976,
"verified": false,
"translator_type": "none",
"profile_text_color": "333333",
"profile_image_url_https": "https://pbs.twimg.com/profile_images/1151986107065593856/6wM9_Pvq_normal.jpg",
"profile_sidebar_fill_color": "DDEEF6",
"entities": {
"description": {
"urls": []
}
},
"followers_count": 75,
"profile_sidebar_border_color": "C0DEED",
"id_str": "1141124347534462976",
"default_profile_image": false,
"listed_count": 0,
"is_translation_enabled": false,
"utc_offset": null,
"statuses_count": 1503,
"description": "\u00a1Viva la vida!",
"friends_count": 156,
"location": "00767",
"profile_link_color": "1DA1F2",
"profile_image_url": "http://pbs.twimg.com/profile_images/1151986107065593856/6wM9_Pvq_normal.jpg",
"notifications": false,
"geo_enabled": false,
"profile_background_color": "F5F8FA",
"profile_banner_url": "https://pbs.twimg.com/profile_banners/1141124347534462976/1563484778",
"profile_background_image_url": null,
"screen_name": "floreessz",
"lang": null,
"following": false,
"profile_background_tile": false,
"favourites_count": 297,
"name": "Flo\ud83e\udd8b",
"url": null,
"created_at": "Tue Jun 18 23:23:40 +0000 2019",
"profile_background_image_url_https": null,
"time_zone": null,
"protected": false,
"default_profile": true,
"is_translator": false
},
"geo": null,
"in_reply_to_user_id_str": null,
"lang": "es",
"created_at": "Fri Jul 19 00:31:36 +0000 2019",
"in_reply_to_status_id_str": null,
"place": null,
"metadata": {
"iso_language_code": "es",
"result_type": "recent"
}
}
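The created_at fields in the JSON above are plain strings. As a small sketch, they can be turned into timezone-aware datetime objects with the standard library, which makes it possible to measure how long after the original post this retweet happened. The format string is an assumption inferred from the sample values shown above, and %a and %b expect English day and month names (the default C locale).

```python
from datetime import datetime

# Timestamps copied from the sample tweet above.
retweet_created_at = "Fri Jul 19 00:31:36 +0000 2019"   # the retweet
original_created_at = "Fri Jul 19 00:14:55 +0000 2019"  # retweeted_status

# Assumed layout of the created_at string: weekday, month, day, time, UTC offset, year.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

retweeted = datetime.strptime(retweet_created_at, TWITTER_TIME_FORMAT)
original = datetime.strptime(original_created_at, TWITTER_TIME_FORMAT)

# Subtracting two aware datetimes gives a timedelta.
delay = retweeted - original
print(retweeted.isoformat())
print(f"Retweeted {delay.total_seconds() / 60:.1f} minutes after the original.")
```

Once parsed, the timestamps can be sorted or grouped by hour or day to study how activity evolved during the protests.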
# Find a video tweet
import json
from tweet_rehydrate.analysis import TweetJLAnalyzer, TweetAnalyzer, getsizeof
from random import randint
JL_DATA="/home/rickyrenuncia/tweetsRickyRenuncia-final.jsonl"
SAMPLE_SIZE = 1113758
count = 0
media_ids=[]
with open(JL_DATA,'r') as data_file:
    for _ in range(SAMPLE_SIZE):
        count+=1
        if count%200000 == 0:
            print(f"Done with: {count}")
        tweet = TweetAnalyzer(data_file.readline())
        if tweet.hasMedia:
            # print("HasMedia",tweet.hasMedia)
            if len(tweet.media) > 0:
                for m in tweet.media:
                    if m.mtype().lower() != "photo" and m.id not in media_ids:
                        media_ids.append(m.id)
                        print(m.id, m.mtype(), m.url())
                        # print(m.data)
            else:
                print("Length 0??")
                try:
                    print(tweet.data["entities"]["media"])
                except:
                    print("No Media at HERE")
                try:
                    print(tweet.data["retweeted_status"]["entities"]["media"])
                except:
                    print("No Media at RETWEET_STATUS")
                print(json.dumps(tweet.data))
                break
print(f"DONE: {count}")
1151689205996822528 video https://video.twimg.com/ext_tw_video/1151689205996822528/pu/vid/360x778/YrQ6FVlC2kMHWMnO.mp4?tag=10 1152001927623761923 video https://video.twimg.com/ext_tw_video/1152001927623761923/pu/vid/640x360/scZWK1m35HHcj-of.mp4?tag=10 1152005409303121921 video https://video.twimg.com/ext_tw_video/1152005409303121921/pu/vid/640x360/U0BllslXNll2b3_e.mp4?tag=10 1152009003687448576 video https://video.twimg.com/ext_tw_video/1152009003687448576/pu/vid/640x360/5zbRUPzUFcmOp-7L.mp4?tag=10 1151860930466263042 video https://video.twimg.com/ext_tw_video/1151860930466263042/pu/vid/360x640/lYd2puCmVM5dX_4f.mp4?tag=10 1151938358534377472 video https://video.twimg.com/ext_tw_video/1151938358534377472/pu/vid/360x640/KzDJbiPaj0l-J1u6.mp4?tag=10 1152008287715561472 video https://video.twimg.com/ext_tw_video/1152008287715561472/pu/vid/640x360/n2oHa2lcp_BKsoDV.mp4?tag=10 1151678566246932480 video https://video.twimg.com/ext_tw_video/1151678566246932480/pu/vid/640x360/2LhsAHtiG0rebrsi.mp4?tag=10 1151862717927608322 video https://video.twimg.com/ext_tw_video/1151862717927608322/pu/vid/640x360/UOMrOVNZx_klCaaB.mp4?tag=10 1152004915192979456 video https://video.twimg.com/ext_tw_video/1152004915192979456/pu/vid/360x640/roiy9_Jn2aGSH7SQ.mp4?tag=10 1151606800719974412 video https://video.twimg.com/ext_tw_video/1151606800719974412/pu/vid/360x640/dQgx5SxsxJemccS_.mp4?tag=10 1151651246803226626 video https://video.twimg.com/ext_tw_video/1151651246803226626/pu/vid/360x636/W2q4IlxSBfyKXb6u.mp4?tag=10 1151489207971536897 video https://video.twimg.com/ext_tw_video/1151489207971536897/pu/vid/848x432/ExtgtUXDU-IDvPvV.mp4?tag=10 1151296144368160773 video https://video.twimg.com/ext_tw_video/1151296144368160773/pu/vid/480x480/OOm7rY7c8FKPR59F.mp4?tag=10 1151717235632877568 video https://video.twimg.com/ext_tw_video/1151717235632877568/pu/vid/360x640/nAAV65cIK59UNA9w.mp4?tag=10 1151992913573793794 video 
https://video.twimg.com/ext_tw_video/1151992913573793794/pu/vid/360x778/0tAZLRjMOklmaoZ1.mp4?tag=10 1151997249703964672 video https://video.twimg.com/ext_tw_video/1151997249703964672/pu/vid/480x480/-clEuROvWDIhsiRI.mp4?tag=10 1151992251536404480 video https://video.twimg.com/ext_tw_video/1151992251536404480/pu/vid/480x480/6_SPdX_Nz0v5KdVi.mp4?tag=10 1151729254989799424 video https://video.twimg.com/ext_tw_video/1151729254989799424/pu/vid/640x360/36tZfXKlePOw7ReD.mp4?tag=10 1151994618000490496 video https://video.twimg.com/ext_tw_video/1151994618000490496/pu/vid/404x720/LluZGZPn5oBCQsPJ.mp4?tag=10 1134728835381882880 video https://video.twimg.com/ext_tw_video/1134728835381882880/pu/vid/480x480/9n5vZ0lxtMOwNcwl.mp4?tag=9 1151602953159086083 video https://video.twimg.com/ext_tw_video/1151602953159086083/pu/vid/360x640/wKJukDjDGIqrcdgr.mp4?tag=10 1151951734698336256 video https://video.twimg.com/ext_tw_video/1151951734698336256/pu/vid/360x640/afU1Zrf5PPl6Txpv.mp4?tag=10 1151981723502227456 video https://video.twimg.com/ext_tw_video/1151981723502227456/pu/vid/360x640/d6wTdVxIE2Sc0loY.mp4?tag=10 1152011962450153474 video https://video.twimg.com/ext_tw_video/1152011962450153474/pu/vid/640x360/wgFz3d9tBHK-7-8p.mp4?tag=10 1151954257681121280 video https://video.twimg.com/ext_tw_video/1151954257681121280/pu/vid/720x406/R23ruNUB-t0JQG39.mp4?tag=10 1151998880021544960 video https://video.twimg.com/ext_tw_video/1151998880021544960/pu/vid/352x640/Sdhe3xPKioIh__98.mp4?tag=10 1151632834567782401 video https://video.twimg.com/ext_tw_video/1151632834567782401/pu/vid/360x778/IPHcKGhUxemIgmcH.mp4?tag=10 1151974606741413889 video https://video.twimg.com/ext_tw_video/1151974606741413889/pu/vid/360x778/NZKLJEy8ugyzdSA0.mp4?tag=10 1151711345882140672 video https://video.twimg.com/ext_tw_video/1151711345882140672/pu/vid/360x640/93M0ICMAxOP_xqLb.mp4?tag=10 1151640522378874880 video https://video.twimg.com/ext_tw_video/1151640522378874880/pu/vid/360x640/I2vPfwsjaNsmRlBb.mp4?tag=10 
1151682719530790914 video https://video.twimg.com/ext_tw_video/1151682719530790914/pu/vid/640x360/5naQSgbrfKd4E8CD.mp4?tag=10 1152012170525327361 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_zE9ykWwAEGfMa.jpg 1152012062240989184 video https://video.twimg.com/ext_tw_video/1152012062240989184/pu/vid/640x360/Ehe1vHrzyOv0MaOh.mp4?tag=10 1151924609253552130 video https://video.twimg.com/ext_tw_video/1151924609253552130/pu/vid/360x740/DuxEzVQDjHo0gt-t.mp4?tag=10 1151618224531955713 video https://video.twimg.com/ext_tw_video/1151618224531955713/pu/vid/848x464/UT-hMJGsNtUrFLOX.mp4?tag=10 1151966952044290069 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_yb1umW4BU-qLB.jpg 1151943150761631746 video https://video.twimg.com/ext_tw_video/1151943150761631746/pu/vid/320x576/mp6vpNcKTUt2SI0c.mp4?tag=10 1151679892746133505 video https://video.twimg.com/ext_tw_video/1151679892746133505/pu/vid/640x360/pF40wNij1xPlF-EF.mp4?tag=10 1152006648854003713 video https://video.twimg.com/ext_tw_video/1152006648854003713/pu/vid/320x320/I6sXYQuUKH4cnUnw.mp4?tag=10 1152007464323186688 video https://video.twimg.com/ext_tw_video/1152007464323186688/pu/vid/640x360/8KL0EN52UUuCWnmY.mp4?tag=10 1152010519857025031 video https://video.twimg.com/ext_tw_video/1152010519857025031/pu/vid/360x492/J-HMsLOwWR_vWWQt.mp4?tag=10 1152010870588870659 video https://video.twimg.com/ext_tw_video/1152010870588870659/pu/vid/360x640/Etpbzxg9Oj3ZAU4g.mp4?tag=10 1151835204518129664 video https://video.twimg.com/ext_tw_video/1151835204518129664/pu/vid/360x640/4IxjMS2DpnKAccq2.mp4?tag=10 1151639194353250306 video https://video.twimg.com/ext_tw_video/1151639194353250306/pu/vid/360x640/mHtKEXbWkzICkPbK.mp4?tag=10 1152009640328216576 video https://video.twimg.com/ext_tw_video/1152009640328216576/pu/vid/360x640/sRb81ORwf_Dd-VoY.mp4?tag=10 1152008328412688385 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_zBeJlUcAELdMd.jpg 1152005905124212736 video 
https://video.twimg.com/ext_tw_video/1152005905124212736/pu/vid/640x360/UqVyyYXceDkiLJo8.mp4?tag=10 1152010601247465472 video https://video.twimg.com/ext_tw_video/1152010601247465472/pu/vid/408x360/H3-ZukPydHd50rSV.mp4?tag=10 1151914095303634944 video https://video.twimg.com/ext_tw_video/1151914095303634944/pu/vid/400x400/pgJFBvGhswgwh-zh.mp4?tag=10 1151575535300087809 video https://video.twimg.com/ext_tw_video/1151575535300087809/pu/vid/640x360/gd-WrgYNjEz8bT8m.mp4?tag=10 1152010180617494528 video https://video.twimg.com/ext_tw_video/1152010180617494528/pu/vid/480x480/gL6ijOx6cwPUPm0M.mp4?tag=10 1151588776927473665 video https://video.twimg.com/ext_tw_video/1151588776927473665/pu/vid/360x640/TMMoXw3dFLcBstbm.mp4?tag=10 1152010272980291584 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_zDPVqXUAArLvm.jpg 1152009755675758592 video https://video.twimg.com/ext_tw_video/1152009755675758592/pu/vid/360x714/2zDaWCL4wEAhA1_6.mp4?tag=10 1151681158561837058 video https://video.twimg.com/ext_tw_video/1151681158561837058/pu/vid/360x640/5wPGHP7onh158aUX.mp4?tag=10 1151995822663507969 video https://video.twimg.com/amplify_video/1151995822663507969/vid/480x480/kay_lmlC8l4zdhHb.mp4?tag=13 1151993416244322306 video https://video.twimg.com/ext_tw_video/1151993416244322306/pu/vid/360x640/vCkB54AyWYhVdtTV.mp4?tag=10 1151999062335352832 video https://video.twimg.com/ext_tw_video/1151999062335352832/pu/vid/360x636/nRkqBotkutNaHFz5.mp4?tag=10 1152008614326022144 video https://video.twimg.com/ext_tw_video/1152008614326022144/pu/vid/480x270/xU7qFNNDZ2S8iVyA.mp4?tag=10 1151979442211831808 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_ynMwGWwAAwE8_.jpg 1152009641905348610 video https://video.twimg.com/ext_tw_video/1152009641905348610/pu/vid/480x480/BKC7z1R0cAJ89xfV.mp4?tag=10 1151986865286733824 video https://video.twimg.com/ext_tw_video/1151986865286733824/pu/vid/360x640/gIq7FfQjFuiGGFDf.mp4?tag=10 1152009312140767232 animated_gif 
http://pbs.twimg.com/tweet_video_thumb/D_zCXaQXsAARuE0.jpg 1151559268669317120 video https://video.twimg.com/ext_tw_video/1151559268669317120/pu/vid/360x640/bPri0vCp6TCuBjnB.mp4?tag=10 1152008485745451009 video https://video.twimg.com/ext_tw_video/1152008485745451009/pu/vid/360x640/D8r0-JySVO8jUnlw.mp4?tag=10 1151709497473359872 video https://video.twimg.com/ext_tw_video/1151709497473359872/pu/vid/360x640/JpMN-E3rEuURMXy0.mp4?tag=10 1151651237219241986 video https://video.twimg.com/ext_tw_video/1151651237219241986/pu/vid/360x778/lAtX7EV_UfXcd15r.mp4?tag=10 1151953189090025475 video https://video.twimg.com/ext_tw_video/1151953189090025475/pu/vid/640x352/FXO581qi3PLVQC0G.mp4?tag=10 1151716406729330689 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_u3-EkXUAElJsM.jpg 1151649073688195072 video https://video.twimg.com/ext_tw_video/1151649073688195072/pu/vid/360x640/BwmAmAC1rnqrlW8L.mp4?tag=10 1151708652425875457 video https://video.twimg.com/ext_tw_video/1151708652425875457/pu/vid/360x640/cxqOMM1qD2NhMtVl.mp4?tag=10 1150923825174986752 video https://video.twimg.com/ext_tw_video/1150923825174986752/pu/vid/640x360/rKwtbhw9iGAjBD00.mp4?tag=10 1151662444688920576 video https://video.twimg.com/ext_tw_video/1151662444688920576/pu/vid/360x640/qAHJaP818TpSU8L8.mp4?tag=10 1152007420304117765 video https://video.twimg.com/ext_tw_video/1152007420304117765/pu/vid/360x696/UcYXCPFGR0EJ5boT.mp4?tag=10 1151784747804348416 video https://video.twimg.com/ext_tw_video/1151784747804348416/pu/vid/360x450/h7Z_i_hOXaVrtCw8.mp4?tag=10 1152008415365013505 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_zBjNgXoAEqaus.jpg 1151865997978198019 video https://video.twimg.com/ext_tw_video/1151865997978198019/pu/vid/360x480/lqb4kGU_ynu1vXDG.mp4?tag=10 1151844659465003008 video https://video.twimg.com/ext_tw_video/1151844659465003008/pu/vid/360x640/rqBwLR5ZenDI09_k.mp4?tag=10 1151709982519435265 video 
https://video.twimg.com/ext_tw_video/1151709982519435265/pu/vid/360x640/Ng7Fn3C6CkBFSVZC.mp4?tag=10 1152007994512744453 video https://video.twimg.com/ext_tw_video/1152007994512744453/pu/vid/848x448/OayCuwemEhahniMf.mp4?tag=10 1151958781812727824 video https://video.twimg.com/ext_tw_video/1151958781812727824/pu/vid/360x636/qB7URnG6MEW1praR.mp4?tag=10 1152005551020077056 video https://video.twimg.com/ext_tw_video/1152005551020077056/pu/vid/360x640/OcQNmuMXKrVDi7ii.mp4?tag=10 1151673505244745732 video https://video.twimg.com/ext_tw_video/1151673505244745732/pu/vid/640x360/5Mbw7v7H-7-46Iqg.mp4?tag=10 1152000979589459970 video https://video.twimg.com/ext_tw_video/1152000979589459970/pu/vid/640x360/TOlk9tanGb_cXB8N.mp4?tag=10 1151617975541207043 video https://video.twimg.com/ext_tw_video/1151617975541207043/pu/vid/360x640/-qRU4bzypLcZdyGc.mp4?tag=10 1151628491584094210 video https://video.twimg.com/ext_tw_video/1151628491584094210/pu/vid/640x360/OkPpLfNLs6-2mtp5.mp4?tag=10 1151652331416621056 video https://video.twimg.com/ext_tw_video/1151652331416621056/pu/vid/360x636/ASpzFQXuc9Pg_ndt.mp4?tag=10 1151260315969097728 video https://video.twimg.com/ext_tw_video/1151260315969097728/pu/vid/360x640/Let6KxkIKIbgd6oM.mp4?tag=10 1152006165204819968 video https://video.twimg.com/ext_tw_video/1152006165204819968/pu/vid/360x640/xzl5jLeCPkYhMasK.mp4?tag=10 1152006028566790144 video https://video.twimg.com/ext_tw_video/1152006028566790144/pu/vid/360x778/j-DpYkdnZYUWBDli.mp4?tag=10 1152005971675209729 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_y_U-DUYAEzfIB.jpg 1151983973482078210 video https://video.twimg.com/ext_tw_video/1151983973482078210/pu/vid/480x480/AvtfFxDgBTU4iyVC.mp4?tag=10 1152005506371686402 video https://video.twimg.com/ext_tw_video/1152005506371686402/pu/vid/360x640/SPb4nNpQjoyTUbJs.mp4?tag=10 1152005085490114561 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_y-hYwU8AEVKpR.jpg 1152005097720885248 video 
https://video.twimg.com/ext_tw_video/1152005097720885248/pu/vid/640x360/m5D_70RJjCIbP7wB.mp4?tag=10 1151016520618823680 video https://video.twimg.com/ext_tw_video/1151016520618823680/pu/vid/352x640/P811vW5sYvGbGoJN.mp4?tag=10 1151916269265915911 video https://video.twimg.com/ext_tw_video/1151916269265915911/pu/vid/480x268/V73r-dvW48Fm9rhn.mp4?tag=10 1152004936953057280 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_y-YvaVUAAM6Pq.jpg 1152003376437088256 video https://video.twimg.com/ext_tw_video/1152003376437088256/pu/vid/480x360/msY5pciq6kwulqyh.mp4?tag=10 1151714024637292544 video https://video.twimg.com/ext_tw_video/1151714024637292544/pu/vid/304x288/ESwPqtbgeB1eDEeg.mp4?tag=10 1152001689211174913 video https://video.twimg.com/ext_tw_video/1152001689211174913/pu/vid/640x352/CLQGbKwFfdEb-FvU.mp4?tag=10 1152004152765009920 video https://video.twimg.com/ext_tw_video/1152004152765009920/pu/vid/360x778/1x5a3Ug1DRKeTrlB.mp4?tag=10 1151720342135746560 video https://video.twimg.com/ext_tw_video/1151720342135746560/pu/vid/640x360/eyTzyffJc62seEPW.mp4?tag=10 1151617667700264960 video https://video.twimg.com/ext_tw_video/1151617667700264960/pu/vid/432x848/MbBrQ-CReNFbLQeD.mp4?tag=10 1151963060350914561 video https://video.twimg.com/ext_tw_video/1151963060350914561/pu/vid/360x778/svkxPNeP5UQma8ki.mp4?tag=10 1151675190276849664 video https://video.twimg.com/ext_tw_video/1151675190276849664/pu/vid/640x360/U2b91efw1ynwJ4D0.mp4?tag=10 1152000183883821056 video https://video.twimg.com/ext_tw_video/1152000183883821056/pu/vid/720x406/imKeKvIWj8B2sav4.mp4?tag=10 1152002410769833985 video https://video.twimg.com/ext_tw_video/1152002410769833985/pu/vid/360x640/tP_QzKXdLeUi1ioW.mp4?tag=10 1152002077255540736 video https://video.twimg.com/ext_tw_video/1152002077255540736/pu/vid/360x640/JWkNuxtulN5EdtQS.mp4?tag=10 1152002988090650624 video https://video.twimg.com/ext_tw_video/1152002988090650624/pu/vid/360x640/jnkagd2-8Jzeh_KI.mp4?tag=10 1152002971040796672 animated_gif 
http://pbs.twimg.com/tweet_video_thumb/D_y8mT0U0AA8AN2.jpg 1151964697652027397 video https://video.twimg.com/ext_tw_video/1151964697652027397/pu/vid/224x400/nROjRQhy4kPK9bL3.mp4?tag=10 1151923646836936704 video https://video.twimg.com/ext_tw_video/1151923646836936704/pu/vid/360x640/mAptjBc6ejYliwiu.mp4?tag=10 1152001752385744896 video https://video.twimg.com/ext_tw_video/1152001752385744896/pu/vid/640x360/MdzU6iadzP51IWTA.mp4?tag=10 1151955616820322308 video https://video.twimg.com/ext_tw_video/1151955616820322308/pu/vid/360x778/EXiaCgOKD2FVsLyX.mp4?tag=10 1151696199742873601 video https://video.twimg.com/ext_tw_video/1151696199742873601/pu/vid/360x640/DzAsyvJgjT3I42vm.mp4?tag=10 1151963974801481732 video https://video.twimg.com/amplify_video/1151963974801481732/vid/640x360/i1gAIOZYF_7Ddyh5.mp4?tag=13 1152001648689991680 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_y7ZVrU0AAllxf.jpg 1151687945079939072 video https://video.twimg.com/ext_tw_video/1151687945079939072/pu/vid/360x640/PrbaRsRmRdql-KUj.mp4?tag=10 1151911852953223168 video https://video.twimg.com/ext_tw_video/1151911852953223168/pu/vid/224x400/mJOHeWQq6Scw8DYY.mp4?tag=10 1151318921640677376 video https://video.twimg.com/ext_tw_video/1151318921640677376/pu/vid/360x640/4GbcVjgnhY6CD5FO.mp4?tag=10 1152000972693970945 video https://video.twimg.com/ext_tw_video/1152000972693970945/pu/vid/636x360/rVi-DiIJ_HJKAJ66.mp4?tag=10 1151895557918613506 video https://video.twimg.com/ext_tw_video/1151895557918613506/pu/vid/360x640/IMRpSMU9PjEfm6zn.mp4?tag=10 1151996735935303680 video https://video.twimg.com/ext_tw_video/1151996735935303680/pu/vid/636x360/P5AeoE5zrnZEoeo9.mp4?tag=10 1151943332765077504 video https://video.twimg.com/ext_tw_video/1151943332765077504/pu/vid/640x360/C6hT106WgxZQMc4-.mp4?tag=10 1151917573556047872 video https://video.twimg.com/ext_tw_video/1151917573556047872/pu/vid/360x480/5Jn3nCK97TBBBOOV.mp4?tag=10 1151852312937345024 video 
https://video.twimg.com/amplify_video/1151852312937345024/vid/480x480/t5cuBF6GZd6m23bg.mp4?tag=13 1151748079714000896 video https://video.twimg.com/ext_tw_video/1151748079714000896/pu/vid/624x360/aXrFv6R7B4477Q11.mp4?tag=10 1151994656755838976 video https://video.twimg.com/ext_tw_video/1151994656755838976/pu/vid/360x636/J12NFLS7_wAQAQMc.mp4?tag=10 1150958472365793283 video https://video.twimg.com/ext_tw_video/1150958472365793283/pu/vid/400x256/jkw-bRV0uFd4usvO.mp4?tag=10 1151689643332648961 video https://video.twimg.com/ext_tw_video/1151689643332648961/pu/vid/224x400/CBhrh7Lz54SZdMIy.mp4?tag=10 1151724833685811201 video https://video.twimg.com/ext_tw_video/1151724833685811201/pu/vid/640x360/lOT0sWf1arYBJvPw.mp4?tag=10 1151998802485661696 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_y4zqvVUAAqmYL.jpg 1151993431243161601 video https://video.twimg.com/ext_tw_video/1151993431243161601/pu/vid/846x360/c68adcGTwgP-82PJ.mp4?tag=10 1150823476040806400 animated_gif http://pbs.twimg.com/tweet_video_thumb/D_iL2qiXsAAiNIi.jpg 1151872985860661248 video https://video.twimg.com/ext_tw_video/1151872985860661248/pu/vid/480x480/mTitNMHBB7s8Xpc3.mp4?tag=10 1151715193715601408 video https://video.twimg.com/ext_tw_video/1151715193715601408/pu/vid/640x360/fmUlt5qa5YQdaZ9k.mp4?tag=10 1151940191751307264 video https://video.twimg.com/ext_tw_video/1151940191751307264/pu/vid/360x778/FGCXmHI0r32dFKdO.mp4?tag=10 1151263998849011712 video https://video.twimg.com/ext_tw_video/1151263998849011712/pu/vid/360x640/AgskmFTJlu7CI68U.mp4?tag=10 1151660789956337664 video https://video.twimg.com/ext_tw_video/1151660789956337664/pu/vid/360x576/D-KmT16S0ioR9LVZ.mp4?tag=10 1151995772746944512 video https://video.twimg.com/ext_tw_video/1151995772746944512/pu/vid/360x640/5WC5dxfewU5B7emM.mp4?tag=10 1151988482492907520 video https://video.twimg.com/ext_tw_video/1151988482492907520/pu/vid/640x360/I4baL1gZEgz_A87D.mp4?tag=10 1151651329170976768 video 
https://video.twimg.com/ext_tw_video/1151651329170976768/pu/vid/640x360/I699VCwQYeR7t99b.mp4?tag=10 1151506558343372800 video https://video.twimg.com/ext_tw_video/1151506558343372800/pu/vid/360x456/Whqhxj0LoFcBMoOB.mp4?tag=10 1151638906212892672 video https://video.twimg.com/ext_tw_video/1151638906212892672/pu/vid/480x480/sjJsT1WpYUotjpVw.mp4?tag=10 1151996298712666112 video https://video.twimg.com/ext_tw_video/1151996298712666112/pu/vid/360x640/KsEq6fEivATHsTFk.mp4?tag=10 1150837429189906433 video https://video.twimg.com/ext_tw_video/1150837429189906433/pu/vid/360x576/kVN3DTLeq_-Sdo1c.mp4?tag=10 1151996049499627520 video https://video.twimg.com/ext_tw_video/1151996049499627520/pu/vid/360x634/leb-2_2MqfJLnQ4U.mp4?tag=10
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-7-c797093b266a> in <module>
     12     if count%200000 == 0:
     13         print(f"Done with: {count}")
---> 14     tweet = TweetAnalyzer(data_file.readline())
     15     if tweet.hasMedia:
     16         # print("HasMedia",tweet.hasMedia)

/home/rickyrenuncia/RickyRenuncia-case-module_shared/tweet_rehydrate/analysis.py in __init__(self, data)
     95         self.data=data
     96         if type(self.data) is str:
---> 97             self.data = json.loads(self.data)
     98         self.extractMeta()
     99 

/usr/lib/python3.8/json/__init__.py in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    355             parse_int is None and parse_float is None and
    356             parse_constant is None and object_pairs_hook is None and not kw):
--> 357         return _default_decoder.decode(s)
    358     if cls is None:
    359         cls = JSONDecoder

/usr/lib/python3.8/json/decoder.py in decode(self, s, _w)
    335 
    336         """
--> 337         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338         end = _w(s, end).end()
    339         if end != len(s):

/usr/lib/python3.8/json/decoder.py in raw_decode(self, s, idx)
    351         """
    352         try:
--> 353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
    355             raise JSONDecodeError("Expecting value", s, err.value) from None

KeyboardInterrupt: 
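The run above was stopped by hand (KeyboardInterrupt) while the loop was still decoding tweets one line at a time. A bounded trial run avoids having to interrupt the kernel. The following is a minimal sketch using only the standard library; it is not the project's TweetAnalyzer class. The function name iter_media and its limit parameter are hypothetical, and the field names (extended_entities, media, video_info, variants) assume the Twitter API v1.1 tweet layout. Picking the highest-bitrate mp4 variant is an arbitrary choice that may differ from what TweetAnalyzer reports.

```python
import io
import json


def iter_media(lines, limit=None):
    """Yield (media_id, media_type, url) for every media item found.

    `lines` is any iterable of JSON strings, one tweet per line, such as
    an open .jsonl file. `limit` (hypothetical helper parameter) caps the
    number of lines read so a trial run ends on its own.
    Assumes the Twitter API v1.1 layout: tweet["extended_entities"]["media"].
    """
    for count, line in enumerate(lines):
        if limit is not None and count >= limit:
            break
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(tweet, dict):
            continue  # a tweet is expected to be a JSON object
        for media in tweet.get("extended_entities", {}).get("media", []):
            variants = media.get("video_info", {}).get("variants", [])
            mp4 = [v for v in variants if v.get("content_type") == "video/mp4"]
            if media.get("type") == "video" and mp4:
                # Assumption: keep the highest-bitrate mp4 variant.
                url = max(mp4, key=lambda v: v.get("bitrate", 0))["url"]
            else:
                # Photos and animated GIFs: fall back to the thumbnail URL.
                url = media.get("media_url")
            yield media.get("id_str"), media.get("type"), url


# Small in-memory stand-in for the .jsonl file (made-up example values).
sample = io.StringIO(
    json.dumps({"id_str": "1", "text": "no media"}) + "\n"
    + json.dumps({"extended_entities": {"media": [{
        "id_str": "42",
        "type": "video",
        "media_url": "http://example.com/thumb.jpg",
        "video_info": {"variants": [
            {"content_type": "application/x-mpegURL",
             "url": "http://example.com/v.m3u8"},
            {"content_type": "video/mp4", "bitrate": 832000,
             "url": "http://example.com/v.mp4"},
        ]},
    }]}}) + "\n"
    + "not json\n"
)

rows = list(iter_media(sample))
for media_id, media_type, url in rows:
    print(media_id, media_type, url)
```

With the real dataset, the same generator could be fed an open file handle, for example iter_media(data_file, limit=200000), so that only the first 200000 lines are read.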
