ElasticSearch and NEST - How do I construct a simple OR query?

Date: 2014-11-12 02:45:27

Tags: elasticsearch nest

I'm developing a query against a repository of buildings.

Here is the query I'm trying to write:

  

(Exact match on zipCode) AND ((case-insensitive exact match on address1) OR (case-insensitive exact match on siteName))

In my repository, I have a document that looks like the following:

  

address1: 4 Myrtle Street
  siteName: Myrtle Street
  zipCode: 90210

And I keep getting matches when I search for:

  

address1: 45 Myrtle Street
  siteName: Myrtle
  zipCode: 90210

Here are some attempts that did not work:

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "term": {
                  "address1": {
                    "value": "45 myrtle street"
                  }
                }
              },
              {
                "term": {
                  "siteName": {
                    "value": "myrtle"
                  }
                }
              }
            ]
          }
        },
        {
          "term": {
            "zipCode": {
              "value": "90210"
            }
          }
        }
      ]
    }
  }
}


{
  "query": {
    "filtered": {
      "query": {
        "term": {
          "zipCode": {
            "value": "90210"
          }
        }
      },
      "filter": {
        "or": {
          "filters": [
            {
              "term": {
                "address1": "45 myrtle street"
              }
            },
            {
              "term": {
                "siteName": "myrtle"
              }
            }
          ]
        }
      }
    }
  }
}




{
  "filter": {
    "bool": {
      "must": [
        {
          "or": {
            "filters": [
              {
                "term": {
                  "address1": "45 myrtle street"
                }
              },
              {
                "term": {
                  "siteName": "myrtle"
                }
              }
            ]
          }
        },
        {
          "term": {
            "zipCode": "90210"
          }
        }
      ]
    }
  }
}




{
  "query": {
    "bool": {
      "must": [
        {
          "span_or": {
            "clauses": [
              {
                "span_term": {
                  "siteName": {
                    "value": "myrtle"
                  }
                }
              }
            ]
          }
        },
        {
          "term": {
            "zipCode": {
              "value": "90210"
            }
          }
        }
      ]
    }
  }
}



{
  "query": {
    "filtered": {
      "query": {
        "term": {
          "zipCode": {
            "value": "90210"
          }
        }
      },
      "filter": {
        "or": {
          "filters": [
            {
              "term": {
                "address1": "45 myrtle street"
              }
            },
            {
              "term": {
                "siteName": "myrtle"
              }
            }
          ]
        }
      }
    }
  }
}

Does anyone know what I'm doing wrong?

I'm writing this with NEST, so I'd prefer NEST syntax, but ElasticSearch syntax would certainly suffice as well.

EDIT: Per @Greg Marzouka's comment, here are the mappings:

{
   [indexname]: {
      "mappings": {
         "[indexname]elasticsearchresponse": {
            "properties": {
               "address": {
                  "type": "string"
               },
               "address1": {
                  "type": "string"
               },
               "address2": {
                  "type": "string"
               },
               "address3": {
                  "type": "string"
               },
               "city": {
                  "type": "string"
               },
               "country": {
                  "type": "string"
               },
               "id": {
                  "type": "string"
               },
               "originalSourceId": {
                  "type": "string"
               },
               "placeId": {
                  "type": "string"
               },
               "siteName": {
                  "type": "string"
               },
               "siteType": {
                  "type": "string"
               },
               "state": {
                  "type": "string"
               },
               "systemId": {
                  "type": "long"
               },
               "zipCode": {
                  "type": "string"
               }
            }
         }
      }
   }
}

1 answer:

Answer 0 (score: 3)

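Based on your mappings, you won't be able to search for exact matches on siteName, because the field is being analyzed with the standard analyzer, which is geared toward full-text search. This is the default analyzer Elasticsearch applies when none is explicitly defined on a field.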

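The standard analyzer breaks the value of siteName into multiple tokens. For example, Myrtle Street is tokenized and stored in the index as two separate terms, myrtle and street, which is why your query for myrtle matches that document. For a case-insensitive exact match, you want Myrtle Street stored in the index as a single lowercased token: myrtle street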

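You could set siteName to not_analyzed, which keeps the field out of the analysis chain entirely, meaning the values are not modified. However, that produces a single Myrtle Street token, which works for exact matches but is case sensitive.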
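For reference, a not_analyzed mapping in the Elasticsearch 1.x syntax used in this thread might look roughly like the fragment below. The type name your_type is a placeholder, not something from the question:

```json
{
  "your_type": {
    "properties": {
      "siteName": { "type": "string", "index": "not_analyzed" }
    }
  }
}
```

With this mapping, a term query for Myrtle Street would match, but one for myrtle street would not.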

What you need to do is create a custom analyzer that uses the keyword tokenizer and the lowercase token filter, then apply it to your field.
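In raw Elasticsearch terms, a minimal sketch of the index-creation body could look like this, assuming the 1.x string mappings shown in the question. The names my_analyzer and your_type are placeholders, and applying the analyzer to address1 as well is an assumption based on the question asking for the same kind of match on both fields:

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "your_type": {
      "properties": {
        "siteName": { "type": "string", "analyzer": "my_analyzer" },
        "address1": { "type": "string", "analyzer": "my_analyzer" }
      }
    }
  }
}
```

Because the analyzer of an existing field generally can't be changed in place, this would likely mean creating a new index and reindexing the documents.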

Here's how you can accomplish this using NEST's fluent API:

// Create the custom analyzer using the keyword tokenizer and lowercase token filter
var myAnalyzer = new CustomAnalyzer
{
    Tokenizer = "keyword",
    Filter = new [] { "lowercase" }
};

var response = this.Client.CreateIndex("your-index-name", c => c
    // Add the custom analyzer to your index settings
    .Analysis(an => an
        .Analyzers(az => az
            .Add("my_analyzer", myAnalyzer)
        )
    )
    // Create the mapping for your type and apply "my_analyzer" to the siteName field
    .AddMapping<YourType>(m => m
        .MapFromAttributes()
        .Properties(ps => ps
            .String(s => s.Name(t => t.SiteName).Analyzer("my_analyzer"))
        )
    )
);
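Once the fields are indexed as single lowercased tokens, the original OR logic could be expressed along the following lines. This is a sketch rather than a tested query: it assumes address1 has been mapped with the same keyword-plus-lowercase analyzer as siteName, and it uses match instead of term so that the search text is run through the field's analyzer (and therefore lowercased) before comparison:

```json
{
  "query": {
    "bool": {
      "must": [
        { "term": { "zipCode": "90210" } },
        {
          "bool": {
            "should": [
              { "match": { "address1": "4 Myrtle Street" } },
              { "match": { "siteName": "Myrtle Street" } }
            ],
            "minimum_should_match": 1
          }
        }
      ]
    }
  }
}
```

A term query would also work against these fields, but since term queries are not analyzed, the caller would have to lowercase the search value first.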