Question

In Solr, I have 3 cores:

Unicore
Core_1
Core_2

Unicore is the common core, which covers Core_1 and Core_2.

I am getting results for the string "50000912" with the following query:

  http://localhost:8983/solr/UniCore/select?q=*text:"50000912"*&wt=json&indent=true&shards=http://localhost:8983/solr/Core_1,http://localhost:8983/solr/Core_2

Output:

"response":{"numFound":4,"start":0,"maxScore":10.04167,"docs":[

But if I pass "5000091" instead of "50000912", by removing the "2" at the end of the string, I get zero results.

Output:

"response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]

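Technically, the same query with a shorter string should return more results. Am I missing something, or is this a bug? Can anyone correct me?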

FYI, this is one of the documents I get from Core_2:

"response":{"numFound":4,"start":0,"maxScore":10.04167,"docs":[
  {
    "Storageloc_For_EP":"2500",
    "Material_Number":"50000912-001",
    "Maximum_LotSize":"0",
    "Totrepl_Leadtime":"3",
    "Prodstor_Location":"2000",
    "Country_Of_Origin":"CN",
    "Planned_Deliv_Time":"1",
    "Planning_Time_Fence":"0",
    "Plant":"5515",
    "GR_Processing_Time":"1",
    "Minimum_LotSize":"7920",
    "Rounding_Value":"720",
    "Service_Level_Days":"0",
    "id":"2716447",
    "Fixed_LotSize":"0",
    "Procurement_Type":"F",
    "Automatic_PO":"X",
    "SchedMargin_Key":"005",
    "Service_Level_Qty":"0",
    "MRP_Type":"ZB",
    "Profit_Center":"B2019",
    "_version_":1531317575416283139,
    "[shard]":"http://localhost:8983/solr/Core_2",
    "score":10.04167},
  {

Answer 1

Solr does not do any substring matching unless you use an NGramFilter (which generates multiple tokens, one for each substring of the original token).

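My guess is that you have indexed the content into a standard text field, which means it is tokenized on "-". The document therefore has 50000912 and 001 stored as separate tokens in the index for Material_Number. Solr only gives a hit when a token produced on the query side exactly matches a token on the index side, so 5000091 matches neither of them.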
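To make the token-matching behaviour concrete, here is a minimal Python sketch. It is not Solr code: `tokenize` is a hypothetical stand-in that only approximates a standard text analyzer by splitting on non-alphanumeric characters, and `matches` is an illustrative helper that requires a whole-token match.

```python
import re


def tokenize(text):
    """Rough approximation of a standard text analyzer (assumption):
    split on anything that is not a letter or digit, then lowercase."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]


def matches(query_term, indexed_value):
    """A hit requires the query term to equal one indexed token exactly."""
    return query_term.lower() in set(tokenize(indexed_value))


indexed = "50000912-001"
print(tokenize(indexed))             # ['50000912', '001']
print(matches("50000912", indexed))  # True: equals an indexed token
print(matches("5000091", indexed))   # False: only a prefix of a token
```

The shorter string fails because it is a prefix of a token rather than a token itself, which is consistent with the zero results in the question.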

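You have a few options. You can add an EdgeNGramFilter, which generates a separate token for each prefix of the original token, starting from the beginning of the string. Alternatively, since this is a numeric-looking value, you can use a string field (not a tokenized field, unless it uses the KeywordTokenizer) and put a wildcard at the end: q=Material_Number:5000091* will give you any document that has a token starting with 5000091.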
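Both options can be sketched in Python. This is an illustration under assumptions, not Solr's implementation: `edge_ngrams` only mimics the kind of prefix tokens an edge n-gram filter emits (the `min_gram` and `max_gram` defaults are made up for the example), and `prefix_query_url` is a hypothetical helper that builds the wildcard query against the cores named in the question.

```python
from urllib.parse import urlencode


def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the prefixes of token with lengths min_gram..max_gram,
    similar in spirit to what an edge n-gram filter would index."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def prefix_query_url(base_url, field, prefix, shards):
    """Build a select URL with a trailing-wildcard query on one field."""
    params = {
        "q": f"{field}:{prefix}*",
        "wt": "json",
        "indent": "true",
        "shards": ",".join(shards),
    }
    return f"{base_url}/select?{urlencode(params)}"


# With edge n-grams indexed, "5000091" becomes a token of its own.
print("5000091" in edge_ngrams("50000912"))

# Wildcard alternative, using the cores from the question.
print(prefix_query_url(
    "http://localhost:8983/solr/UniCore",
    "Material_Number",
    "5000091",
    ["http://localhost:8983/solr/Core_1", "http://localhost:8983/solr/Core_2"],
))
```

The n-gram route costs index size but keeps queries simple, while the wildcard route needs no reindexing if the field is already a string field.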

Solr does not retrieve results for partial string matches in shards
