elasticsearch: I can't retrieve hit.from from the JSON document

Time: 2017-06-26 08:41:03

Tags: python json elasticsearch

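I am trying to print the results of a query, and I found that I cannot retrieve hit.from from the JSON document. It gives me SyntaxError: invalid syntax. from is a field of the JSON doc, which represents an email.

Here is the query code: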

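from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

client = Elasticsearch()

query_string = raw_input("Enter your query string: ")  # Python 2; use input() on Python 3
print(query_string)

# Parentheses let the chained call continue on the next line
s = (Search(using=client, index="enron_test")
     .query("match", message_body=query_string))

response = s.execute()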

Here is the code that prints the results:

for hit in response:
    print(hit.from)
    print(hit.to)
    print(hit.message_body)
    print(line + "\n")

Strangely, from is highlighted in gray.

Here is the JSON doc:

 {
            "_index": "enron_test",
            "_type": "enron_type",
            "_id": "AVzjqJ_-Jz9fMnhSpbdm",
            "_score": 1.7868237,
            "_source": {
               "message_body": "'Arry (calxa@aol.com), Lime Ex-splurt Extraordinaire...
               "content-transfer-encoding": "7bit",
               "file_name": "2.",
               "sub_holder": "",
               "subject": "Re: History of  Lime and Cement",
               "x-cc": "strawbale@crest.org",
               "from": "rob_tom@freenet.carleton.ca",
               "x-folder": "\\Phillip_Allen_Dec2000\\Notes Folders\\Straw",
               "content_size_in_bytes": 1811,
               "to": "calxa@aol.com",
               "x-origin": "Allen-P",
               "mime-version": "1.0",
               "x-bcc": "",
               "x-filename": "pallen.nsf",
               "date": "Thu, 17 Feb 2000 07:37:00 -0800 (PST)",
               "x-to": "CALXA@aol.com",
               "loaded_on": "2017-06-26",
               "cc": "strawbale@crest.org",
               "bcc": "strawbale@crest.org",
               "x-from": "rob_tom@freenet.carleton.ca (Robert W. Tom)",
               "message-id": "<22208447.1075855692838.JavaMail.evans@thyme>",
               "content-type": "text/plain; charset=us-ascii"
            }
         },

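1 answer:

Answer 0 (score: 1)

Elasticsearch results contain a number of additional fields that give you more information about the document, alongside the content itself.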

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "0",
        "_score": 1.3862944,
        "_source": {
          "user": "kimchy",
          "message": "trying out Elasticsearch",
          "date": "2009-11-15T14:12:12",
          "likes": 0
        }
      }
    ]
  }
}
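To make the layout concrete, here is a minimal sketch that walks a response of this shape as a plain Python dict. The `sample_response` literal and the `extract_sources` helper are hypothetical names introduced only for illustration; no Elasticsearch client is involved.

```python
# Hypothetical sample mirroring the search response layout (trimmed).
sample_response = {
    "took": 1,
    "timed_out": False,
    "hits": {
        "total": 1,
        "max_score": 1.3862944,
        "hits": [
            {
                "_index": "twitter",
                "_type": "tweet",
                "_id": "0",
                "_score": 1.3862944,
                "_source": {
                    "user": "kimchy",
                    "message": "trying out Elasticsearch",
                },
            }
        ],
    },
}


def extract_sources(response):
    """Return the original documents, i.e. the `_source` of each hit."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


sources = extract_sources(sample_response)
print(sources[0]["user"])
```

The metadata keys (`_index`, `_id`, `_score`) live next to `_source`, not inside it, which is why the document fields need one extra level of lookup.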

So, to get the from field from the actual document, you need to do the following:

for hit in response['hits']['hits']:
    print(hit['_source']['from'])
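The SyntaxError itself comes from Python rather than Elasticsearch: `from` is a reserved keyword, so it cannot be written after a dot as an attribute name. The sketch below shows this with the standard library only. `SimpleNamespace` is a stand-in for a hit object and is an assumption for illustration, not the elasticsearch_dsl hit class.

```python
import keyword
from types import SimpleNamespace

# `from` is a reserved word, which is why `hit.from` cannot be parsed.
print(keyword.iskeyword("from"))
print(keyword.iskeyword("to"))

# Stand-in for a hit; a real hit object may behave differently.
hit = SimpleNamespace(to="calxa@aol.com")
setattr(hit, "from", "rob_tom@freenet.carleton.ca")

# getattr takes the name as a string, so the keyword restriction does not apply.
sender = getattr(hit, "from")
print(sender)

# Compiling the dotted form reproduces the SyntaxError from the question.
try:
    compile("hit.from", "<example>", "eval")
    parsed = True
except SyntaxError:
    parsed = False
print(parsed)
```

String-based access such as `getattr(hit, 'from')` is expected to work on elasticsearch_dsl hits as well, but treat that as an assumption to verify against the library version in use.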