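I have to read a line-delimited JSON file and extract the key from each line. Eventually, those keys will be used to delete documents from an ES index.
However, when reading the file, the extracted value is:
b'74298dcbd08507175b94fbe5c2a6a87d'
rather than:
74298dcbd08507175b94fbe5c2a6a87d
The code that reads the lines from the file is: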
from elasticsearch import Elasticsearch, helpers
import json

es = Elasticsearch("a.b.c.d:9200")
delete_patch_destination = "delete.json"
index_name = "some_index"

with open(delete_patch_destination) as delete_json_file:
    for line in delete_json_file:
        # print(line)
        line_content = json.loads(line)
        # line_content = json.loads(line)
        # for es_key in line_content.items():
        for es_key in line_content.keys():
            print(es_key)
            # es.delete(index=index_name, doc_type="latest",id=es_key)
The JSON file consists of lines like the following:
{"b'af2f9719a205f0ce9ae27c951e5b7037'": "\"b'af2f9719a205f0ce9ae27c951e5b7037'\""}
{"b'2b2781de47c70b11576a0f67bc59050a'": "\"b'2b2781de47c70b11576a0f67bc59050a'\""}
{"b'6cf97818c6b5c5a94b7d8dbb4cfcfe60'": "\"b'6cf97818c6b5c5a94b7d8dbb4cfcfe60'\""}
{"b'ceaf66243d3eb226859ee5ae7eacf86a'": "\"b'ceaf66243d3eb226859ee5ae7eacf86a'\""}
{"b'164a12ea5947e1f51566ee6939e20a2e'": "\"b'164a12ea5947e1f51566ee6939e20a2e'\""}
{"b'42e9bb704c424b49fb5e6adb68157e6f'": "\"b'42e9bb704c424b49fb5e6adb68157e6f'\""}
Answer 0 (score: 1)
Decode the bytes to a string, as described in:
How to convert 'binary string' to normal string in Python3?
b'a_string'.decode('utf-8')
This gives you 'a_string'.
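One caveat worth checking before applying this to the question's data: `decode` only exists on a real bytes object. Assuming Python 3, the keys that `json.loads` returns from the file shown above are ordinary `str` values whose text happens to start with `b'`, so they have no `decode` method. A minimal sketch of the difference (the variable names are illustrative only):

```python
# A real bytes object decodes to a plain str.
raw = b'74298dcbd08507175b94fbe5c2a6a87d'
decoded = raw.decode('utf-8')
print(decoded)

# A key loaded from the JSON file is already a str; its text merely
# looks like a bytes literal. In Python 3, str has no decode() method.
key_from_json = "b'74298dcbd08507175b94fbe5c2a6a87d'"
print(isinstance(key_from_json, str))
print(hasattr(key_from_json, 'decode'))
```

So for the file in the question, the `b'...'` text has to be parsed back into bytes first, and only then decoded.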
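Answer 1 (score: 1)
The input could be improved to avoid these convolutions, but to solve your immediate problem:
Your dictionaries appear to consist of keys with the same data repeated as the values (the values are even more "stringified"; we'll ignore that part).
First evaluate the key with ast.literal_eval, then decode the result to turn it into a string: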
>>> import ast
>>> s = "b'af2f9719a205f0ce9ae27c951e5b7037'"
>>> ast.literal_eval(s).decode()
'af2f9719a205f0ce9ae27c951e5b7037'
(Unlike eval, this way of evaluating has no security issues: Using python's eval() vs. ast.literal_eval()?)
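Applied to the file format from the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: `clean_key` is a hypothetical helper name, the two sample lines are copied from the question instead of being read from delete.json, and every key is assumed to contain a valid bytes literal.

```python
import ast
import json


def clean_key(raw_key):
    """Hypothetical helper: turn the text "b'...'" into the plain string inside it."""
    value = ast.literal_eval(raw_key)  # safely parses the bytes literal
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)


# Two sample lines copied from the question; a raw string keeps the
# backslashes exactly as they appear in the file.
sample = r'''
{"b'af2f9719a205f0ce9ae27c951e5b7037'": "\"b'af2f9719a205f0ce9ae27c951e5b7037'\""}
{"b'2b2781de47c70b11576a0f67bc59050a'": "\"b'2b2781de47c70b11576a0f67bc59050a'\""}
'''

ids = []
for line in sample.splitlines():
    if not line.strip():
        continue  # skip blank lines
    for raw_key in json.loads(line):
        ids.append(clean_key(raw_key))

# Each cleaned id could then be passed as the id argument of the
# es.delete call that is commented out in the question's loop.
print(ids)
```

In the question's own loop, the equivalent change would be to call the helper on `es_key` before printing it or using it as the document id.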