我有这些" json"我想插入我的mongodb数据库的文件。
一个例子是: http://s.live.ksmobile.net/cheetahlive/de/ff/15201023827214369775/15201023827214369775.json
问题是,它形成如下:
{ "channelType":"TEMPGROUP", ... } # line 1
{ "channelType":"TEMPGROUP", ... } # line 2
因此,它不是将其作为1个文档插入数据库中,而是将每一行插入1个条目。最终应该是3" json"数据库中的文件代替了数据库中的1189个文档。
如何插入" .json"的全部内容?一个文件?
我的代码是:
replay_url = "http://live.ksmobile.net/live/getreplayvideos?"
userid = 969730808384462848
url2 = replay_url + urllib.parse.urlencode({'userid': userid}) + '&page_size=1000'
raw_replay_data = requests.get(url2).json()
for i in raw_replay_data['data']['video_info']:
url3 = i['msgfile']
raw_message_data = urllib.request.urlopen(url3)
for line in raw_message_data:
json_data = json.loads(line)
messages.insert_one(json_data)
print(json_data)
更新以提供更多信息以便回答
messages.insert(json_data)给出了这个错误:
Traceback (most recent call last):
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/collection.py", line 633, in _insert
blk.execute(concern, session=session)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/bulk.py", line 432, in execute
return self.execute_command(generator, write_concern, session)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/bulk.py", line 329, in execute_command
raise BulkWriteError(full_result)
pymongo.errors.BulkWriteError: batch op errors occurred
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/import_messages_dev.py", line 43, in <module>
messages.insert(json_data)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/collection.py", line 2941, in insert
check_keys, manipulate, write_concern)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/collection.py", line 635, in _insert
_raise_last_error(bwe.details)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/helpers.py", line 220, in _raise_last_error
_raise_last_write_error(write_errors)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/helpers.py", line 188, in _raise_last_write_error
raise DuplicateKeyError(error.get("errmsg"), 11000, error)
pymongo.errors.DuplicateKeyError: E11000 duplicate key error index: liveme.messages.$_id_ dup key: { : ObjectId('5aa2fc6f5d60126499060949') }
messages.insert_one(json_data)给了我这个错误:
Traceback (most recent call last):
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/import_messages_dev.py", line 43, in <module>
messages.insert_one(json_data)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/collection.py", line 676, in insert_one
common.validate_is_document_type("document", document)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/common.py", line 434, in validate_is_document_type
"collections.MutableMapping" % (option,))
TypeError: document must be an instance of dict, bson.son.SON, bson.raw_bson.RawBSONDocument, or a type that inherits from collections.MutableMapping
messages.insert_many(json_data)给了我这个错误:
Traceback (most recent call last):
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/import_messages_dev.py", line 43, in <module>
messages.insert_many(json_data)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/collection.py", line 742, in insert_many
blk.execute(self.write_concern.document, session=session)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/bulk.py", line 432, in execute
return self.execute_command(generator, write_concern, session)
File "/media/anon/06bcf743-8b4d-409f-addc-520fc4e19299/PycharmProjects/LiveMe/venv1/lib/python3.6/site-packages/pymongo/bulk.py", line 329, in execute_command
raise BulkWriteError(full_result)
pymongo.errors.BulkWriteError: batch op errors occurred
messages.insert和messages.insert_many都插入1行并抛出错误。
答案 0 :(得分:1)
这些文件显然不包含格式正确的json - 而是每行包含一个单独的对象。
要将它们转换为有效的json,您可能需要一个对象列表,即:
[{ "channelType":"TEMPGROUP", ... },
{ "channelType":"TEMPGROUP", ... }]
您可以通过以下方式实现此目的:
for i in raw_replay_data['data']['video_info']:
url3 = i['msgfile']
raw_message_data = urllib.request.urlopen(url3)
json_data = []
for line in raw_message_data:
json_data.append(json.loads(line))
messages.insert_one(json_data)
print(json_data)