How do I get the actual string from a line-JSON file in Python 3?

Asked: 2018-12-18 10:30:37

Tags: json python-3.x

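I have to read a line-JSON file (one JSON object per line) and extract the key from each line. Eventually, these keys will be deleted from an ES index.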

However, when reading the file, the extracted value is b'74298dcbd08507175b94fbe5c2a6a87d' rather than 74298dcbd08507175b94fbe5c2a6a87d. The code that reads the lines from the file is:

from elasticsearch import Elasticsearch, helpers
import json

es = Elasticsearch("a.b.c.d:9200")
delete_patch_destination = "delete.json"
index_name = "some_index"

with open(delete_patch_destination) as delete_json_file:
    for line in delete_json_file:
        # print(line)
        line_content = json.loads(line)
        # for es_key in line_content.items():
        for es_key in line_content.keys():
            print(es_key)
            # es.delete(index=index_name, doc_type="latest", id=es_key)

The JSON file consists of lines like the following:

{"b'af2f9719a205f0ce9ae27c951e5b7037'": "\"b'af2f9719a205f0ce9ae27c951e5b7037'\""}
{"b'2b2781de47c70b11576a0f67bc59050a'": "\"b'2b2781de47c70b11576a0f67bc59050a'\""}
{"b'6cf97818c6b5c5a94b7d8dbb4cfcfe60'": "\"b'6cf97818c6b5c5a94b7d8dbb4cfcfe60'\""}
{"b'ceaf66243d3eb226859ee5ae7eacf86a'": "\"b'ceaf66243d3eb226859ee5ae7eacf86a'\""}
{"b'164a12ea5947e1f51566ee6939e20a2e'": "\"b'164a12ea5947e1f51566ee6939e20a2e'\""}
{"b'42e9bb704c424b49fb5e6adb68157e6f'": "\"b'42e9bb704c424b49fb5e6adb68157e6f'\""}

2 answers:

Answer 0: (score: 1)

Decode the string, as described in:

How to convert 'binary string' to normal string in Python3?

b'a_string'.decode('utf-8')

You will get "a_string".
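As a minimal sketch of what decode does: the first part decodes a real bytes object, and the second part checks what type a key has after json.loads. The sample JSON text here is made up for illustration. It suggests that a key shaped like the ones in the question is a str whose text merely contains the b'...' notation, so decode() would not be available on it directly.

```python
import json

# A real bytes object can be decoded straight into a str.
raw = b"a_string"
decoded = raw.decode("utf-8")
print(decoded)

# Keys produced by json.loads are always str objects. This made-up line
# imitates the shape of the question's data: the key's text is b'a_string'.
key = next(iter(json.loads('{"b\'a_string\'": 1}')))
print(type(key).__name__, hasattr(key, "decode"))
```

If the key is a str like this, it first has to be turned back into bytes before decode() applies.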

Answer 1: (score: 1)

The input could be improved to avoid these contortions, but to solve your immediate problem:

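Your dictionary seems to consist of a key and a value that hold the same data (the value is even more "stringified"; we'll ignore that part).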

First evaluate the key with ast.literal_eval to get a bytes object, then decode it to convert it to a string:

>>> import ast
>>> s = "b'af2f9719a205f0ce9ae27c951e5b7037'"
>>> ast.literal_eval(s).decode()
'af2f9719a205f0ce9ae27c951e5b7037'

(Unlike eval, this way of evaluating has no security issues: Using python's eval() vs. ast.literal_eval()?)
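Here is a rough sketch of how the ast.literal_eval approach might be applied to every line of the file. The helper name iter_ids is hypothetical, and the in-memory sample stands in for delete.json; it assumes every key has the b'...' form shown in the question. A key that is not a valid Python literal would make ast.literal_eval raise an exception.

```python
import ast
import io
import json


def iter_ids(lines):
    """Yield a plain-string id for every key on every JSON line.

    Hypothetical helper; assumes keys look like "b'<hex digest>'".
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        for key in json.loads(line):
            value = ast.literal_eval(key)  # "b'...'" (str) -> b'...' (bytes)
            if isinstance(value, bytes):
                value = value.decode("utf-8")
            yield value


# Two lines copied from the question, used here instead of the real file.
sample = io.StringIO(
    r'''{"b'af2f9719a205f0ce9ae27c951e5b7037'": "\"b'af2f9719a205f0ce9ae27c951e5b7037'\""}
{"b'2b2781de47c70b11576a0f67bc59050a'": "\"b'2b2781de47c70b11576a0f67bc59050a'\""}
'''
)
ids = list(iter_ids(sample))
print(ids)
```

With the real file, the open file object from the question's with block could be passed to iter_ids in place of sample, and each yielded id used in the es.delete call that is commented out in the question.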