The company has 18 records, but I can only see 3 of them. Below are my query and my Python code.
{
  "query": {
    "nested": {
      "inner_hits": {
        "_source": [
          "name",
          "country",
          "_matched_experiences.role"
        ]
      },
      "path": "socials",
      "query": {
        "match": {
          "socials._has_email": "true"
        }
      }
    }
  },
  "_source": [
    "com_name"
  ]
}
Here is my Python code.
import csv
import elasticsearch.helpers

with open(OUTPUT_FILENAME_1, "a") as f1:
    csv_writer_1 = csv.writer(f1)
    csv_writer_1.writerow(["company_name", "name", "country", "role"])
    query_dictionary = {...}  # the query shown above
    scroll = elasticsearch.helpers.scan(es, query=query_dictionary, index=companydirectory, scroll='60m', size=800)
    for res in scroll:
        try:
            record_fields = res["_source"]
            name = ""
            com_name = ""
            company_name = record_fields.get("com_name")  # from ES
            name_record_fields = res["inner_hits"]["social_contacts"]["hits"]["hits"]
            for j in name_record_fields:
                name = j['_source']['name']  # from ES
                k = j['_source']['_matched_experiences']
                role = k[0].get('role')
                country = j['_source']['country']
                # validated_email_fromES, function_id and level_id are not defined in this snippet
                print company_name,name,validated_email_fromES, function_id,level_id, country, company_name,role
                # csv_writer_1.writerow([company_name.encode('utf8'),name.encode('utf8'),country,role.encode('utf8')])
        except Exception as e1:
            pass
Here is a sample of the ES output:
"_source": {
  "company_name": "Rothborns"
},
"inner_hits": {
  "social_contacts": {
    "hits": {
      "total": 18,
      "max_score": 9.87977,
      "hits": [
        {
          "_type": "comp_directory",
          "_id": "MC9MY",
          "_nested": {
            "field": "socials",
            "offset": 36
          },
          "_score": 9.787,
          "_source": {
            "country": "SA",
            "name": "warner Pauli",
            "_has_email": true,
            "_matched_experiences": [
              {
                "role": "Financial Controller"
              }
            ]
          }
The total number of records for Rothborns is 18, but I only get 3 records for Rothborns in the output file.
Please help. Thanks.
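Answer 0 (score: 2)
The reason is that inner_hits returns only 3 hits per parent document by default, because its default size is 3. You just need to change the query to: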
{
  "query": {
    "nested": {
      "inner_hits": {
        "size": 100, <--- add this
        "_source": [
          "name",
          "country",
          "_matched_experiences.role"
        ]
      },
      "path": "socials",
      "query": {
        "match": {
          "socials._has_email": "true"
        }
      }
    }
  },
  "_source": [
    "com_name"
  ]
}
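
On the Python side, the same change can be applied to the query dictionary before it is passed to the scan helper. The sketch below is only an illustration under a few assumptions: the helper names with_inner_hits_size and rows_from_hit are hypothetical, the inner hits are assumed to be keyed as "social_contacts" (as in the sample output), and the company field is assumed to be "com_name" (as in the question's code). Also note that, as far as I know, newer Elasticsearch versions cap the inner_hits size through the index.max_inner_result_window index setting (default 100), so a larger value may require raising that setting.

```python
import copy


def with_inner_hits_size(query, size):
    """Return a copy of a nested query with inner_hits.size set to `size`."""
    updated = copy.deepcopy(query)  # leave the caller's dictionary untouched
    updated["query"]["nested"].setdefault("inner_hits", {})["size"] = size
    return updated


def rows_from_hit(hit, inner_name="social_contacts", company_field="com_name"):
    """Flatten one search hit into (company, name, country, role) tuples.

    `inner_name` and `company_field` are assumptions taken from the question.
    """
    company = hit["_source"].get(company_field, "")
    rows = []
    for inner in hit["inner_hits"][inner_name]["hits"]["hits"]:
        src = inner["_source"]
        # Fall back to an empty experience if none was matched.
        experiences = src.get("_matched_experiences") or [{}]
        rows.append((
            company,
            src.get("name", ""),
            src.get("country", ""),
            experiences[0].get("role", ""),
        ))
    return rows
```

With this, the loop in the question would call rows_from_hit(res) for each result and write every returned tuple to the CSV, instead of swallowing errors for records with missing fields.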