How to display all records in Elasticsearch? I can only retrieve exactly 3 records from Elasticsearch

Time: 2017-04-27 03:59:41

Tags: python elasticsearch

There are 18 records under the company, but I can only see 3 of them. Below are my query and my Python code.

{
  "query": {
    "nested": {
      "inner_hits": {
        "_source": [
          "name",
          "country",
          "_matched_experiences.role"
        ]
      },
      "path": "socials",
      "query": {
        "match": {
          "socials._has_email": "true"
        }
      }
    }
  },
  "_source": [
    "com_name"
  ]
}

Below is my Python code.

with open(OUTPUT_FILENAME_1, "a") as f1:
        csv_writer_1 = csv.writer(f1)
        csv_writer_1.writerow(["company_name","name","country","role"])
        query_dictionary = {above query}

        scroll = elasticsearch.helpers.scan(es, query=query_dictionary, index=companydirectory, scroll='60m', size=800)
        for res in scroll:
            try:
                record_fields = res["_source"]  
                name = ""
                com_name = ""
                company_name = record_fields.get("com_name")  #from the ES
                name_record_fields = res["inner_hits"]["social_contacts"]["hits"]["hits"]
                for j in name_record_fields:
                    name = j['_source']['name']    #from ES
                    k = j['_source']['_matched_experiences']
                    role = k[0].get('role')
                    country = j['_source']['country']
                    # Print only names defined in this snippet; undefined ones would raise a NameError that the except below hides
                    print company_name, name, country, role
                    # csv_writer_1.writerow([company_name.encode('utf8'),name.encode('utf8'),country,role.encode('utf8')])  

            except Exception as e1:
                pass

This is sample output from ES:

  "_source": {
    "company_name": "Rothborns"
    },
    "inner_hits": {
    "social_contacts": {
    "hits": {
    "total": 18,
    "max_score": 9.87977,
    "hits": [
    {
    "_type": "comp_directory",
    "_id": "MC9MY",
    "_nested": {
    "field": "socials",
    "offset": 36
    },
    "_score": 9.787,
    "_source": {
    "country": "SA",
    "name": "warner Pauli",
    "_has_email": true,
    "_matched_experiences": [
    {
    "role": "Financial Controller"
    }
    ]
    }

The total number of records for Rothborns is 18, but I only get 3 records for Rothborns in the output file.

Please help. Thanks.

1 Answer:

Answer 0: (score: 2)

The reason is that inner_hits returns only 3 hits by default, i.e. its default size is 3. You simply need to change your query to:

{
  "query": {
    "nested": {
      "inner_hits": {
        "size": 100,                  <--- add this
        "_source": [
          "name",
          "country",
          "_matched_experiences.role"
        ]
      },
      "path": "socials",
      "query": {
        "match": {
          "socials._has_email": "true"
        }
      }
    }
  },
  "_source": [
    "com_name"
  ]
}
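
For completeness, here is a minimal Python 3 sketch of how the same change could be applied from the script, plus a way to flatten each scanned hit into CSV-ready rows. The helper names `with_inner_hits_size` and `rows_from_hit` are hypothetical and not part of the elasticsearch client. The key under `inner_hits` is passed in as a parameter because the question's code and sample output use "social_contacts" while the query shown does not name its inner hits, so the right key for your index is an assumption to check. Depending on the Elasticsearch version, the size you can request may also be capped by the `index.max_inner_result_window` index setting.

```python
import copy


def with_inner_hits_size(query, size):
    """Return a copy of a nested query with inner_hits.size set (hypothetical helper)."""
    updated = copy.deepcopy(query)
    updated["query"]["nested"].setdefault("inner_hits", {})["size"] = size
    return updated


def rows_from_hit(hit, inner_hits_key):
    """Yield one (company, name, country, role) tuple per nested inner hit."""
    company = hit.get("_source", {}).get("com_name")
    inner = hit.get("inner_hits", {}).get(inner_hits_key, {})
    for nested in inner.get("hits", {}).get("hits", []):
        source = nested.get("_source", {})
        # Fall back to an empty experience so a missing role yields None.
        experiences = source.get("_matched_experiences") or [{}]
        yield (company, source.get("name"), source.get("country"),
               experiences[0].get("role"))


# The query from the question, unchanged.
query = {
    "query": {
        "nested": {
            "inner_hits": {
                "_source": ["name", "country", "_matched_experiences.role"]
            },
            "path": "socials",
            "query": {"match": {"socials._has_email": "true"}},
        }
    },
    "_source": ["com_name"],
}

# Same query, but asking for up to 100 inner hits per parent document.
sized_query = with_inner_hits_size(query, 100)
```

`sized_query` can then be passed as `query=` to `elasticsearch.helpers.scan`, and each result fed through `rows_from_hit(res, "social_contacts")` before writing with `csv_writer_1.writerow`. Missing fields come back as `None` here, which avoids the need for a blanket `except` that hides errors.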