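Convert a string representation of a dictionary into an actual dictionary

Date: 2019-07-17 21:26:44

Tags: python json python-3.x dictionary

I have a large JSON file that I open and store in a variable named data. A short snippet that prints data gives me the following output, which is of type string: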

'{"_index": "11_20190714_184325_01", "_type": "11", "_id": "1feb78aff16852ed", "_score": 0.0, "fields": {"c_u": ["hvhprecision.com"], "tawgs.id": ["p10813", "p449", "p6426", "p6427"]}}{"_index": "11_20190714_184325_01", "_type": "11", "_id": "786fd4ad2415aa7b", "_score": 0.0, "fields": {"c_u": ["thomsonreuters.com"], "tawgs.id": ["p12519", "p510", "p6426"]}}{"_index": "11_20190714_184325_01", "_type": "11", "_id": "5826e7cbd92d951a", "_score": 0.0, "fields": {"tawgs.id": ["p12505", "p18053", "p6426", "p816", "p826", "p8453", "p8458"]}}'

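I need to convert this string representation of a dictionary into an actual dictionary so that I can flatten it and build a data frame.

However, when I try: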

import ast
ast.literal_eval(data)

I get an "invalid syntax" error.

Trying this code:

import json
with open("es-output.json", "r") as f:
    dictionary = json.loads(f.read())

gives me this error:

JSONDecodeError: Extra data: line 1 column 184 (char 183)

Doing the simple thing:

json.loads(data)

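also produces the same error as above: JSONDecodeError: Extra data: line 1 column 184 (char 183)

I don't understand why the string can't be converted to a dictionary, especially with the ast library.

Please help, and thank you in advance!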

1 Answer:

Answer 0 (score: 0)

This string

data = '{"_index": "11_20190714_184325_01", "_type": "11", "_id": "1feb78aff16852ed", "_score": 0.0, "fields": {"c_u": ["hvhprecision.com"], "tawgs.id": ["p10813", "p449", "p6426", "p6427"]}}{"_index": "11_20190714_184325_01", "_type": "11", "_id": "786fd4ad2415aa7b", "_score": 0.0, "fields": {"c_u": ["thomsonreuters.com"], "tawgs.id": ["p12519", "p510", "p6426"]}}{"_index": "11_20190714_184325_01", "_type": "11", "_id": "5826e7cbd92d951a", "_score": 0.0, "fields": {"tawgs.id": ["p12505", "p18053", "p6426", "p816", "p826", "p8453", "p8458"]}}'

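contains three separate JSON objects placed back to back with no separator. That is not a single valid JSON document (hence the "Extra data" error), and you can't make one dictionary out of three.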
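To see where the "Extra data" error comes from, here is a minimal sketch using a hypothetical, shortened input (the string concatenated below is made up for illustration and is not the data from the question). The JSON decoder parses the first object successfully and then complains about the text that is left over:

```python
import json

# Hypothetical shortened input: two JSON objects with no separator between them.
concatenated = '{"a": 1}{"b": 2}'

error_message = None
error_position = None
try:
    json.loads(concatenated)
except json.JSONDecodeError as exc:
    # The first object parses fine; the decoder then finds leftover text.
    error_message = exc.msg    # short description of the problem
    error_position = exc.pos   # index where the leftover text starts

print(error_message, error_position)
```

The reported position is the index at which the second object begins, which is consistent with the error in the question pointing at the end of the first record.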

You need to separate them first and then run them through ast. You can use replace to insert a comma between adjacent objects:

import ast
# Insert commas between adjacent objects so the string parses as a tuple literal
data = data.replace("}{", "},{")
ast.literal_eval(data)

This returns a tuple containing your dictionaries:

({'_index': '11_20190714_184325_01', '_type': '11', '_id': '1feb78aff16852ed', '_score': 0.0, 'fields': {'c_u': ['hvhprecision.com'], 'tawgs.id': ['p10813', 'p449', 'p6426', 'p6427']}}, {'_index': '11_20190714_184325_01', '_type': '11', '_id': '786fd4ad2415aa7b', '_score': 0.0, 'fields': {'c_u': ['thomsonreuters.com'], 'tawgs.id': ['p12519', 'p510', 'p6426']}}, {'_index': '11_20190714_184325_01', '_type': '11', '_id': '5826e7cbd92d951a', '_score': 0.0, 'fields': {'tawgs.id': ['p12505', 'p18053', 'p6426', 'p816', 'p826', 'p8453', 'p8458']}})
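One caveat with the replace approach: it would also rewrite a "}{" that happens to occur inside a string value, and ast.literal_eval does not understand the JSON literals true, false and null. As a possible alternative, here is a minimal sketch that walks the string with json.JSONDecoder.raw_decode from the standard library, which returns each parsed object together with the index where it ended. The helper name parse_concatenated_json and the sample string are assumptions made for illustration, not part of the original answer:

```python
import json

def parse_concatenated_json(text):
    """Parse back-to-back JSON documents from one string into a list."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed object and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects

# Hypothetical sample: the second record contains "}{" inside a string value
# and a JSON null, both of which the replace + literal_eval approach mishandles.
sample = (
    '{"_id": "a", "fields": {"c_u": ["x.com"]}}'
    '{"_id": "b", "fields": {"note": "has }{ inside", "extra": null}}'
)
records = parse_concatenated_json(sample)
print(len(records), records[1]["fields"]["note"])
```

The result is a list of ordinary dictionaries, one per record, which can then be flattened and passed on to whatever builds the data frame.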