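I am trying to read a series of JSON files and convert them into a Pandas DataFrame, but none of the examples I have followed work for the reading step.
Here is a sample of one of my JSON files: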
{
"created_at": "Thu Nov 02 01:09:12 +0000 2017",
"text": "RT @coindesk: SEC: Celebrity ICO Endorsements Could Be Illegal gHoWduXOBp t.co/iyWla0Ryuk",
"tweet_id": 925892516087558145,
"user_id": 153962533,
"user_name": "Christine Duhaime"
}{
"created_at": "Thu Nov 02 01:09:44 +0000 2017",
"text": "Cornell Professor C t.co/RuNu6UQyr9",
"tweet_id": 925892650884108289,
"user_id": 1255045351,
"user_name": "Local SEO Somerset"
}
I tried:
import codecs
import json
with codecs.open('./output/streamer_20171022-2010.json', 'r+', encoding='utf-8') as data_file:
    data = json.load(data_file)
which results in:
JSONDecodeError: Extra data: line 1 column 416 (char 415)
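I also tried reading the file line by line, without success.
Any ideas?
Answer 0 (score: 1)
Your file is not valid JSON. A valid JSON document can have only one top-level element, so the parser stops after the first object and reports everything that follows as "Extra data".
Try putting the top-level objects into an array: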
[
{ "created_at": "Thu Nov 02 01:09:12 +0000 2017",
"text": "RT @coindesk: SEC: Celebrity ICO Endorsements Could Be Illegal gHoWduXOBp t.co/iyWla0Ryuk",
"tweet_id": 925892516087558145,
"user_id": 153962533,
"user_name": "Christine Duhaime"
}, {
"created_at": "Thu Nov 02 01:09:44 +0000 2017",
"text": "Cornell Professor C t.co/RuNu6UQyr9",
"tweet_id": 925892650884108289,
"user_id": 1255045351,
"user_name": "Local SEO Somerset"
}
]
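If editing the files by hand is not practical, the back-to-back objects can also be decoded one at a time. The following is only a minimal sketch, assuming each file fits in memory and contains nothing but JSON objects separated by optional whitespace. The function name `parse_concatenated_json` and the shortened `sample` string are hypothetical and used for illustration only; `json.JSONDecoder.raw_decode` is the standard-library call doing the work.

```python
import json


def parse_concatenated_json(text):
    """Return a list of the top-level JSON values found back to back in text."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


# Hypothetical, shortened stand-in for the contents of one file.
sample = '{"tweet_id": 1, "user_name": "a"}{\n"tweet_id": 2, "user_name": "b"\n}\n'
records = parse_concatenated_json(sample)
```

With pandas installed, passing the resulting list of dictionaries to `pandas.DataFrame(records)` should give one row per object. For a real file, read its contents with `open(path, encoding='utf-8').read()` and pass the string to the function.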