Elasticsearch version: 2.3.3
Plugins installed: none
JVM version: 1.8.0_91
OS version: Linux version 3.19.0-56-generic (Ubuntu 4.8.2-19ubuntu1)
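I get strange results when I query nested objects on multiple paths. I want to search for all female patients with dementia. The results do contain the matching patients. However, I also get other diagnoses that I did not search for, simply because they belong to those patients.
For example, although I only searched for dementia, the hit below also includes the patient's other diagnoses.
Why?
I want only the dementia diagnoses of female patients, and none of the other diagnoses.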
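Client_Demographic_Details contains one document per patient. Diagnosis contains multiple documents per patient. The ultimate goal is to index my entire data set from a PostgreSQL DB (72 tables and over 1600 columns in total) into Elasticsearch.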
Query:
{'query': {
'bool': {
'must': [
{'nested': {
'path': 'Diagnosis',
'query': {
'bool': {
'must': [{'match_phrase': {'Diagnosis.Diagnosis': {'query': "dementia"}}}]
}
}
}},
{'nested': {
'path': 'Client_Demographic_Details',
'query': {
'bool': {
'must': [{'match_phrase': {'Client_Demographic_Details.Gender_Description': {'query': "female"}}}]
}
}
}}
]
}
}}
Result:
{
"hits": {
"hits": [
{
"_score": 3.4594634,
"_type": "Patient",
"_id": "72",
"_source": {
"Client_Demographic_Details": [
{
"Gender_Description": "Female",
"Patient_ID": 72
}
],
"Diagnosis": [
{
"Diagnosis": "F00.0 - Dementia in Alzheimer's disease with early onset",
"Patient_ID": 72
},
{
"Patient_ID": 72,
"Diagnosis": "F99.X - Mental disorder, not otherwise specified"
},
{
"Patient_ID": 72,
"Diagnosis": "I10.X - Essential (primary) hypertension",
}
]
},
"_index": "denorm1"
}
],
"total": 6,
"max_score": 3.4594634
},
"_shards": {
"successful": 5,
"failed": 0,
"total": 5
},
"took": 8,
"timed_out": false
}
Mapping:
{
"denorm1" : {
"aliases" : { },
"mappings" : {
"Patient" : {
"properties" : {
"Client_Demographic_Details" : {
"type" : "nested",
"properties" : {
"Patient_ID" : {
"type" : "long"
},
"Gender_Description" : {
"type" : "string"
}
}
},
"Diagnosis" : {
"type" : "nested",
"properties" : {
"Patient_ID" : {
"type" : "long"
},
"Diagnosis" : {
"type" : "string"
}
}
}
}
}
},
"settings" : {
"index" : {
"creation_date" : "1473974457603",
"number_of_shards" : "5",
"number_of_replicas" : "1",
"uuid" : "Jo9cI4kRQjeWcZ7WMB6ZAw",
"version" : {
"created" : "2030399"
}
}
},
"warmers" : { }
}
}
Answer 0 (score: 1)
Try this:
{
"_source": {
"exclude": [
"Client_Demographic_Details",
"Diagnosis"
]
},
"query": {
"bool": {
"must": [
{
"nested": {
"path": "Diagnosis",
"query": {
"bool": {
"must": [
{
"match_phrase": {
"Diagnosis.Diagnosis": {
"query": "dementia"
}
}
}
]
}
},
"inner_hits": {}
}
},
{
"nested": {
"path": "Client_Demographic_Details",
"query": {
"bool": {
"must": [
{
"match_phrase": {
"Client_Demographic_Details.Gender_Description": {
"query": "female"
}
}
}
]
}
},
"inner_hits": {}
}
}
]
}
}
}
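The matching nested documents will be returned inside inner_hits, and the rest of the document in _source.
I know this is not a definitive solution.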
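To illustrate reading the matches back, here is a minimal sketch. It assumes the response has the Elasticsearch 2.x shape, where each top-level hit carries an `inner_hits` object keyed by the nested path. The function name `matched_nested` and the trimmed sample response are hypothetical and exist only for illustration.

```python
def matched_nested(response, path):
    """Map each hit's _id to the _source of its matching nested docs under `path`."""
    result = {}
    for hit in response["hits"]["hits"]:
        # inner_hits is keyed by the nested path (assumed default naming).
        inner = hit.get("inner_hits", {}).get(path, {})
        nested_hits = inner.get("hits", {}).get("hits", [])
        result[hit["_id"]] = [h["_source"] for h in nested_hits]
    return result


# Hypothetical, trimmed response used only to exercise the function.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "72",
                "inner_hits": {
                    "Diagnosis": {
                        "hits": {
                            "hits": [
                                {
                                    "_source": {
                                        "Patient_ID": 72,
                                        "Diagnosis": "F00.0 - Dementia in Alzheimer's disease with early onset",
                                    }
                                }
                            ]
                        }
                    }
                },
            }
        ]
    }
}

print(matched_nested(sample_response, "Diagnosis"))
```

Only the dementia entry appears for patient 72 here, because the other diagnoses are not part of the inner hits.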
Answer 1 (score: 0)
As @blackmamba suggested, I built a mapping with Client_Demographic_Details as the root object and Diagnosis as a nested object.
Mapping:
{
"denorm2" : {
"aliases" : { },
"mappings" : {
"Patient" : {
"properties" : {
"BRC_ID" : {
"type" : "long"
},
"Diagnosis" : {
"type" : "nested",
"properties" : {
"BRC_ID" : {
"type" : "long"
},
"Diagnosis" : {
"type" : "string"
}
}
},
"Gender_Description" : {
"type" : "string"
}
}
}
},
"settings" : {
"index" : {
"creation_date" : "1474031740689",
"number_of_shards" : "5",
"number_of_replicas" : "1",
"uuid" : "fMeKa6sfThmxkg_281WdHA",
"version" : {
"created" : "2030399"
}
}
},
"warmers" : { }
}
}
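For readers reindexing existing data, the reshaping implied by this mapping might look like the sketch below. It is an assumption that BRC_ID plays the role Patient_ID had in the first index; the thread does not confirm that. The helper name `flatten_patient` is hypothetical.

```python
def flatten_patient(doc):
    """Lift the single demographics entry to the root and keep Diagnosis nested."""
    demographics = doc.get("Client_Demographic_Details") or [{}]
    first = demographics[0]  # one demographics entry per patient, per the question
    return {
        # Assumption: BRC_ID corresponds to the old Patient_ID field.
        "BRC_ID": first.get("Patient_ID"),
        "Gender_Description": first.get("Gender_Description"),
        "Diagnosis": [
            {"BRC_ID": d.get("Patient_ID"), "Diagnosis": d.get("Diagnosis")}
            for d in doc.get("Diagnosis", [])
        ],
    }


# Document shaped like the _source in the question's result (trimmed).
old_doc = {
    "Client_Demographic_Details": [{"Gender_Description": "Female", "Patient_ID": 72}],
    "Diagnosis": [
        {"Patient_ID": 72, "Diagnosis": "F00.0 - Dementia in Alzheimer's disease with early onset"},
        {"Patient_ID": 72, "Diagnosis": "I10.X - Essential (primary) hypertension"},
    ],
}

new_doc = flatten_patient(old_doc)
print(new_doc["BRC_ID"], new_doc["Gender_Description"], len(new_doc["Diagnosis"]))
```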
Query:
I added source filtering and highlighting.
{
'_source': {
'exclude': ['Diagnosis'],
'include': ['BRC_ID', 'Gender_Description']
},
'highlight': {
'fields': {
'Gender_Description': {}
}
},
'query': {
'bool': {
'must': [
{'nested': {
'path': 'Diagnosis',
'query': {
'bool': {
'must': [{'match_phrase': {'Diagnosis.Diagnosis': {'query': "dementia"}}}]
}
},
'inner_hits': {
'highlight': {
'fields': {
'Diagnosis.Diagnosis': {}
}
},
'_source': ['BRC_ID', 'Diagnosis']
}
}},
{'match_phrase': {'Gender_Description': {'query': "female"}}}
]
}
}}
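The same query body could be parameterized so that other diagnosis and gender terms can be searched. The following is a rough sketch under that assumption; `build_query` is a hypothetical helper, highlighting is left out for brevity, and the call that sends the body to Elasticsearch is not shown.

```python
def build_query(diagnosis, gender):
    """Build a search body: nested match on Diagnosis, root-level match on gender."""
    return {
        "_source": {"include": ["BRC_ID", "Gender_Description"]},
        "query": {
            "bool": {
                "must": [
                    {
                        "nested": {
                            "path": "Diagnosis",
                            "query": {
                                "bool": {
                                    "must": [
                                        {"match_phrase": {"Diagnosis.Diagnosis": {"query": diagnosis}}}
                                    ]
                                }
                            },
                            # Return only the nested docs that matched.
                            "inner_hits": {},
                        }
                    },
                    {"match_phrase": {"Gender_Description": {"query": gender}}},
                ]
            }
        },
    }


body = build_query("dementia", "female")
print(body["query"]["bool"]["must"][1])
```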