Can't seem to work with a Twitter JSON dataset

Time: 2018-05-01 10:30:11

Tags: python json parsing twitter jupyter-notebook

First of all, I am a complete beginner, so I apologize if this is too easy or trivial.

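I have some large Twitter JSON datasets from archive.org (for example https://archive.org/details/archiveteam-twitter-stream-2017-01) that I would like to filter on certain hashtags and make somewhat readable using Python. So far, I can't seem to open the files with Python or Jupyter, and I can't seem to sort them at all.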

An example of what the file looks like:

{"created_at":"Sun Oct 22 06:30:00 +0000 2017","id":921986981168422912,"id_str":"921986981168422912","text":"RT @hypebizzle: \"Tell your dog to leave me alone, it's annoying\" \n\nFirst of all, get out of my house","source":"\u003ca href=\"http://twitter.com/download/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":421547249,"id_str":"421547249","name":"Cris","screen_name":"crisbeltran98","location":"Cajeme, Sonora","url":"http://Instagram.com/cristinabeltraan","description":"il futuro non \u00e8 scritto // Lic. in Psicology on my way. \\ \u201cCristina saludos, un beso\" LFHP","translator_type":"none","protected":false,"verified":false,"followers_count":1498,"friends_count":1383,"listed_count":6,"favourites_count":3174,"statuses_count":39135,"created_at":"Sat Nov 26 02:51:49 +0000 2011","utc_offset":-25200,"time_zone":"Arizona","geo_enabled":true,"lang":"es","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http://pbs.twimg.com/profile_background_images/768201074/3b0047f4eb39cd54a3a82a2d62fa715a.png","profile_background_image_url_https":"https://pbs.twimg.com/profile_background_images/768201074/3b0047f4eb39cd54a3a82a2d62fa715a.png","profile_background_tile":true,"profile_link_color":"000088","profile_sidebar_border_color":"FFFFFF","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http://pbs.twimg.com/profile_images/919935822694047745/nm6uOnr3_normal.jpg","profile_image_url_https":"https://pbs.twimg.com/profile_images/919935822694047745/nm6uOnr3_normal.jpg","profile_banner_url":"https://pbs.twimg.com/profile_banners/421547249/1508164767","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"retweeted_status":

Does anyone know what steps to take? I can't seem to find a solution online.

1 Answer:

Answer 0: (score: 0)

Welcome to Stack Overflow! What have you tried so far? When I open JSON in Python, this is what I do:

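import json
from pprint import pprint  # import the function itself; the bare module is not callable

# Replace 'YOUR JSON DATA' with the path to your JSON file
with open('YOUR JSON DATA') as f:
    df = json.load(f)  # parses the whole file as a single JSON document
pprint(df)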

Once that is done, you can access your data like this:

df["created_at"]
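One caveat: json.load expects the whole file to be a single JSON document. If the file holds one tweet per line, which the sample in the question suggests but which I have not verified for that archive, json.load will fail with an "Extra data" error. Below is a minimal sketch of reading such a file line by line and filtering on hashtags. It assumes the standard tweet layout, where hashtags sit under entities -> hashtags -> text. The function names and the sample lines are made up for illustration.

```python
import json


def iter_tweets(lines):
    """Yield one dict per non-empty line, skipping lines that are not valid JSON objects."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Truncated or corrupt line: skip it instead of aborting the whole run.
            continue
        if isinstance(record, dict):
            yield record


def has_hashtag(tweet, wanted):
    """Return True if the tweet carries any hashtag in `wanted` (lowercase, no '#')."""
    # Assumption: hashtags are stored as tweet["entities"]["hashtags"][i]["text"].
    entities = tweet.get("entities") or {}
    hashtags = entities.get("hashtags") or []
    tags = {h.get("text", "").lower() for h in hashtags}
    return bool(tags & wanted)


def filter_by_hashtag(lines, hashtags):
    """Yield only the tweets that carry at least one of the given hashtags."""
    wanted = {h.lstrip("#").lower() for h in hashtags}
    for tweet in iter_tweets(lines):
        if has_hashtag(tweet, wanted):
            yield tweet


# Made-up sample lines standing in for a real file (one JSON object per line).
sample_lines = [
    '{"created_at": "Sun Oct 22 06:30:00 +0000 2017", "text": "Learning #Python", '
    '"entities": {"hashtags": [{"text": "Python"}]}}',
    '{"created_at": "Sun Oct 22 06:30:01 +0000 2017", "text": "No tags here", '
    '"entities": {"hashtags": []}}',
    '{"delete": {"status": {"id": 1}}}',  # record without any tweet fields
    '{"created_at": "Sun Oct 22',         # truncated line, skipped by iter_tweets
]

matches = list(filter_by_hashtag(sample_lines, ["#python"]))
for tweet in matches:
    print(tweet["created_at"], tweet["text"])
```

For a real file, pass the open file object instead of the sample list, e.g. `with open(path, encoding="utf-8") as f: ...`. A file object yields one line at a time, so a large file never has to fit in memory. If the downloaded files are bz2-compressed, `bz2.open(path, "rt", encoding="utf-8")` from the standard library can be used the same way.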