Data retrieval and querying Elasticsearch with Python

Date: 2019-01-14 13:13:03

Tags: python elasticsearch data-retrieval

I am trying to fetch data from Elasticsearch using Python. I am able to connect to the data source from Python.

Connecting to Elasticsearch

from elasticsearch import Elasticsearch

try:
    es = Elasticsearch(
        ['https://movies-elk-prod-elastic-ssl.pac.com'],  # URL scheme needs "://"
        http_auth=('xxxxx', 'xxxx'),
        port=10202,
    )
    # es.info() makes the first real request, so connection errors surface here
    print("Connected", es.info())
except Exception as ex:
    print("Error:", ex)
es.info(pretty=True)

Input data

            {
                "_index": "movies",
                "_type": "movie",
                "_id": "4444169",
                "_score": null,
                "_source": {
                    "actor": [
                        "Josh Duhamel",
                        "Megan Fox",
                        "Shia LaBeouf"
                    ],
                    "director": [
                        "Michael Bay"
                    ],
                    "full_text": "When Sam Witwicky learns the truth about the ancient origins of the Transformers, he must accept his fate and merge with Optimus Prime and Bumblebee in their epic battle against the Decepticons, who are stronger back than ever and plan to destroy our world.!",
                    "title": "Transformers: Revenge of the Fallen",
                    "type": "movie"
                },
                "sort": [
                    1544310000000,
                    4.05
                ]
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "4051",
                "_score": null,
                "_source": {
                    "actor": [
                        "Josh Duhamel",
                        "Shia LaBeouf",
                        "Patrick Dempsey"
                    ],
                    "director": [
                        "Michael Bay"
                    ],
                    "full_text": "A mysterious event from the Earth's past threatens to unleash such a devastating war that the Transformers can not possibly save the planet on its own. Sam Witwicky and the Autobots fight the darkness to defend our world against the devastating evil of the Decepticons.",
                    "title": "Transformers: Dark of the Moon",
                    "type": "movie"
                },
                "sort": [
                    1544310000000,
                    4.03949
                ]
            },

Next, I want to write a Python function that queries Elasticsearch by actor. For example, if I pass in Josh Duhamel, it should return all movies that include Josh Duhamel.

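After that, I want to convert the data into a Python DataFrame. I tried a few things based on functions I found, but they did not work for me (please note that I am new to both Elasticsearch and Python :-)).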

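import json

import requests

# Elasticsearch 6.0 and later reject request bodies without a JSON content type
JSON_HEADERS = {"Content-Type": "application/json"}

def search(uri, term):
    """Simple Elasticsearch Query"""
    query = json.dumps({
        "query": {
            "match": {
                "content": term
            }
        }
    })
    response = requests.get(uri, data=query, headers=JSON_HEADERS)
    results = json.loads(response.text)
    return results

def format_results(results):
    """Print results nicely:
    doc_id) content
    """
    data = [doc for doc in results['hits']['hits']]
    for doc in data:
        # the closing parenthesis of print() was missing here
        print("%s) %s" % (doc['_id'], doc['_source']['content']))

def create_doc(uri, doc_data=None):
    """Create new document."""
    # avoid a mutable default argument; fall back to an empty document
    query = json.dumps(doc_data or {})
    response = requests.post(uri, data=query, headers=JSON_HEADERS)
    print(response)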

from elasticsearch import Elasticsearch

es = Elasticsearch()
res = es.search(index="test", doc_type="articles", body={"query": {"match": {"content": "fox"}}})
print("%d documents found" % res['hits']['total'])
for doc in res['hits']['hits']:
    print("%s) %s" % (doc['_id'], doc['_source']['content']))

Any help and insight is appreciated. Thanks in advance.

1 answer:

Answer 0 (score: 0)

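The snippet you posted looks for a field named "content" in ES. To make it work, change "content" to the field you actually need from your documents, for example "actor". The same applies to the other parts of the code that look for the "content" field.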

"query": {
    "match": {
        "actor": term
    }
}
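
Putting that change together with the DataFrame step from the question, a minimal sketch could look like the following. The helper names (`build_actor_query`, `hits_to_records`, `search_by_actor`, `to_dataframe`), the `size` value, and the choice of columns are assumptions made for illustration, not part of the original post. It also assumes an elasticsearch-py client whose `search()` accepts `index` and `body`, as in the question's own snippet, and that pandas is installed for the last step.

```python
def build_actor_query(actor, size=100):
    """Build a search body that matches documents on the `actor` field."""
    return {
        "size": size,  # assumed page size; Elasticsearch returns 10 hits by default
        "query": {
            "match": {
                "actor": actor
            }
        },
    }


def hits_to_records(response):
    """Flatten the hits of a search response into a list of plain dicts."""
    records = []
    for hit in response["hits"]["hits"]:
        source = hit["_source"]
        records.append({
            "id": hit["_id"],
            "title": source.get("title"),
            # actor and director are lists in the sample data; join them for display
            "actor": ", ".join(source.get("actor", [])),
            "director": ", ".join(source.get("director", [])),
        })
    return records


def search_by_actor(es, actor, index="movies"):
    """Query `index` for movies with the given actor, using an existing client `es`."""
    response = es.search(index=index, body=build_actor_query(actor))
    return hits_to_records(response)


def to_dataframe(records):
    """Convert the records to a pandas DataFrame (pandas is imported lazily)."""
    import pandas as pd
    return pd.DataFrame(records)
```

With the `es` client from the question, usage would be along the lines of `df = to_dataframe(search_by_actor(es, "Josh Duhamel"))`. Note that on an analyzed text field, `match` returns documents containing any of the query's tokens, so "Josh Duhamel" could also match another Josh; if that turns out to be a problem, `match_phrase` is the stricter alternative.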