I have my Twitter account timeline data, with each tweet saved in .json format, and I cannot save the data to MongoDB.
Example: the data for a single tweet.
{
"created_at": "Fri Apr 12 05:13:35 +0000 2019",
"id": 1116570031511359489,
"id_str": "1116570031511359489",
"full_text": "@jurafsky How can i get your video lectures related to Sentiment Analysis",
"truncated": false,
"display_text_range": [0, 73],
"entities": {
"hashtags": [],
"symbols": [],
"user_mentions": [
{
"screen_name": "jurafsky",
"name": "Dan Jurafsky",
"id": 14968475,
"id_str": "14968475",
"indices": [0, 9]
}
],
"urls": []
}
It also contains URLs and other fields, which are omitted here.
I tried the following code.
from pymongo import MongoClient
import json
client=MongoClient('localhost',27107)
db=client.test
coll=db.dataset
with open('tweets.json') as f:
    file_data=json.loads(f.read())
coll.insert(file_data)
client.close()
Answer 0: (score: 1)
Try this:
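from pymongo import MongoClient
import json

# Port kept from the question; note that MongoDB's default port is 27017.
client = MongoClient('localhost', 27107)
db = client.test
coll = db.dataset

with open('tweets.json') as f:
    file_data = json.load(f)  # parse directly from the file object

# Collection.insert() was removed in PyMongo 4; use insert_many()/insert_one().
if isinstance(file_data, list):
    coll.insert_many(file_data)
else:
    coll.insert_one(file_data)
client.close()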
Answer 1: (score: 0)
My JSON dataset was invalid; I had to merge the tweets into a single array object.
Thanks to: Can't parse json file: json.decoder.JSONDecodeError: Extra data.
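For anyone hitting the same "Extra data" error, here is a minimal sketch of merging the objects into one list. It assumes the file holds several JSON objects written back to back, separated only by whitespace, which is the usual cause of that error but is not confirmed above. The function name `parse_concatenated_json` is made up for illustration.

```python
import json


def parse_concatenated_json(text):
    """Parse a string of back-to-back JSON values into a Python list.

    Assumption: values are separated only by whitespace (no commas).
    """
    decoder = json.JSONDecoder()
    docs = []
    pos = 0
    while pos < len(text):
        # raw_decode() does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode() returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        docs.append(obj)
    return docs


# Small in-memory example standing in for the contents of tweets.json.
sample = '{"id": 1, "full_text": "first"}\n{"id": 2, "full_text": "second"}\n'
docs = parse_concatenated_json(sample)
print(len(docs))
```

The resulting list is a valid JSON array once dumped with `json.dumps(docs)`, and it should also be possible to pass it straight to `coll.insert_many(docs)` as long as it is not empty.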