Solr shows results containing hyphenated words first

Time: 2016-08-08 09:53:00

Tags: solr scoring hyphenation

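I have the following problem: I search for a term and get results, and so far everything is fine. But if the term exists in the Solr index as part of a hyphenated word, the documents containing that hyphenated word always get a higher score and are shown at the top of the results.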

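To test this, I took the third entry in the search results and changed a non-hyphenated word in it to a hyphenated word. After reindexing the document and searching for the same term, I expected it to get the same score as before. Instead, the document in which I changed the word is now ranked first.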

The text field type in my schema.xml looks like this:

 <fieldType name="text" class="solr.TextField" sortMissingLast="true"  positionIncrementGap="100">
   <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory" />
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords-de.txt" />
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" splitOnCaseChange="0" splitOnNumerics="0" catenateWords="1" catenateNumbers="0" catenateAll="1" stemEnglishPossessive="1" preserveOriginal="1" />
      <filter class="solr.GermanNormalizationFilterFactory" />
      <filter class="solr.LowerCaseFilterFactory" />
      <filter class="solr.WordDelimiterFilterFactory" catenateAll="1" preserveOriginal="1" />
   </analyzer>
 </fieldType>

Does anyone know why this leads to different results? Any help is much appreciated.
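To make the analyzer chain above easier to reason about, here is a toy Python model of it. It is only a sketch under stated assumptions, not Solr's implementation: the hypothetical `word_delimiter` function assumes that a WordDelimiterFilterFactory with `preserveOriginal="1"` emits the original token, its word parts, and one concatenation, and that a token without delimiters passes through once. The stop word and German normalization filters are left out because they should not affect the example terms.

```python
import re

def word_delimiter(token):
    """Toy stand-in for WordDelimiterFilterFactory (assumed behavior, see lead-in)."""
    parts = [p for p in re.split(r"[\W_]+", token) if p]
    if len(parts) <= 1:
        # Nothing to split: the token passes through once.
        return [token]
    # Original token, then the word parts, then the concatenation of all parts.
    return [token] + parts + ["".join(parts)]

def analyze(text):
    """Toy model of the chain: whitespace split, delimiter, lowercase, delimiter."""
    tokens = text.split()                                        # WhitespaceTokenizerFactory
    tokens = [t for tok in tokens for t in word_delimiter(tok)]  # first delimiter filter
    tokens = [t.lower() for t in tokens]                         # LowerCaseFilterFactory
    tokens = [t for tok in tokens for t in word_delimiter(tok)]  # second delimiter filter
    return tokens

if __name__ == "__main__":
    for text in ("Meyer", "Meyer-Landrut"):
        tokens = analyze(text)
        print(text, tokens, "meyer count:", tokens.count("meyer"))
```

In this model the second delimiter filter splits the preserved original token a second time, so a hyphenated word contributes the term `meyer` twice while the plain word contributes it once. Whether the real filters behave the same way is best checked on the Analysis screen of the Solr admin UI.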

Update: Before hyphenating the word, I ran a search query for "Meyer" and got the following results:

<lst name="debug">
  <str name="rawquerystring">Meyer</str>
  <str name="querystring">Meyer</str>
  <str name="parsedquery">(+DisjunctionMaxQuery((content:meyer | title:meyer | keywords:meyer | h1:meyer | description:meyer | browsertitle:meyer^3)))/no_coord</str>
  <str name="parsedquery_toString">+(content:meyer | title:meyer | keywords:meyer | h1:meyer | description:meyer | browsertitle:meyer^3)</str>
  <lst name="explain">
    <str name="ID1">
2.1717649 = max of:
  0.471918 = weight(content:meyer in 26) [DefaultSimilarity], result of:
    0.471918 = score(doc=26,freq=4.0), product of:
      0.32961872 = queryWeight, product of:
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.057556875 = queryNorm
      1.4317087 = fieldWeight in 26, product of:
        2.0 = tf(freq=4.0), with freq of:
          4.0 = termFreq=4.0
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.125 = fieldNorm(doc=26)
  0.9652289 = weight(title:meyer in 26) [DefaultSimilarity], result of:
    0.9652289 = score(doc=26,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 26, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=26)
  0.9652289 = weight(description:meyer in 26) [DefaultSimilarity], result of:
    0.9652289 = score(doc=26,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 26, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=26)
  2.1717649 = weight(browserTitle:meyer^3.0 in 26) [DefaultSimilarity], result of:
    2.1717649 = fieldWeight in 26, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      5.7913733 = idf(docFreq=14, maxDocs=1807)
      0.375 = fieldNorm(doc=26)
</str>
    <str name="ID2">
2.1717649 = max of:
  0.471918 = weight(content:meyer in 222) [DefaultSimilarity], result of:
    0.471918 = score(doc=222,freq=4.0), product of:
      0.32961872 = queryWeight, product of:
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.057556875 = queryNorm
      1.4317087 = fieldWeight in 222, product of:
        2.0 = tf(freq=4.0), with freq of:
          4.0 = termFreq=4.0
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.125 = fieldNorm(doc=222)
  0.9652289 = weight(title:meyer in 222) [DefaultSimilarity], result of:
    0.9652289 = score(doc=222,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 222, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=222)
  0.9652289 = weight(description:meyer in 222) [DefaultSimilarity], result of:
    0.9652289 = score(doc=222,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 222, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=222)
  2.1717649 = weight(browserTitle:meyer^3.0 in 222) [DefaultSimilarity], result of:
    2.1717649 = fieldWeight in 222, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      5.7913733 = idf(docFreq=14, maxDocs=1807)
      0.375 = fieldNorm(doc=222)
</str>
    <str name="ID3">
2.1717649 = max of:
  0.471918 = weight(content:meyer in 234) [DefaultSimilarity], result of:
    0.471918 = score(doc=234,freq=4.0), product of:
      0.32961872 = queryWeight, product of:
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.057556875 = queryNorm
      1.4317087 = fieldWeight in 234, product of:
        2.0 = tf(freq=4.0), with freq of:
          4.0 = termFreq=4.0
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.125 = fieldNorm(doc=234)
  0.9652289 = weight(title:meyer in 234) [DefaultSimilarity], result of:
    0.9652289 = score(doc=234,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 234, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=234)
  0.9652289 = weight(description:meyer in 234) [DefaultSimilarity], result of:
    0.9652289 = score(doc=234,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 234, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=234)
  2.1717649 = weight(browserTitle:meyer^3.0 in 234) [DefaultSimilarity], result of:
    2.1717649 = fieldWeight in 234, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      5.7913733 = idf(docFreq=14, maxDocs=1807)
      0.375 = fieldNorm(doc=234)
</str>
</lst>
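The numbers in this explain output can be reproduced by hand. The sketch below assumes the classic Lucene TF-IDF formulas used by DefaultSimilarity, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes queryNorm and fieldNorm directly from the output. The function names are illustrative only.

```python
import math

def idf(doc_freq, max_docs):
    """Inverse document frequency as in classic Lucene TF-IDF."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)

def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm, as listed in the explain output."""
    return tf(freq) * idf(doc_freq, max_docs) * field_norm

QUERY_NORM = 0.057556875  # copied from the explain output

# browserTitle clause: the explain output lists only the fieldWeight here,
# which is consistent with a boosted query weight of 0.33333334 * 3 = 1.0.
browser_title = field_weight(1.0, 14, 1807, 0.375)

# content clause: score = queryWeight * fieldWeight
content_query_weight = idf(15, 1807) * QUERY_NORM
content = content_query_weight * field_weight(4.0, 15, 1807, 0.125)

if __name__ == "__main__":
    print("idf(14, 1807)   =", round(idf(14, 1807), 4))
    print("browserTitle    =", round(browser_title, 4))
    print("content         =", round(content, 4))
    print("document score  =", round(max(browser_title, content), 4))
```

Because the query is a DisjunctionMaxQuery, the document score is the maximum over the field clauses, which here is the browserTitle clause for all three documents.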

Then I changed "Meyer" to "Meyer-Landrut" in the third result, reindexed, and ran the search again:

<lst name="debug">
  <str name="rawquerystring">Meyer</str>
  <str name="querystring">Meyer</str>
  <str name="parsedquery">(+DisjunctionMaxQuery((content:meyer | title:meyer | keywords:meyer | h1:meyer | description:meyer | browsertitle:meyer^3)))/no_coord</str>
  <str name="parsedquery_toString">+(content:meyer | title:meyer | keywords:meyer | h1:meyer | description:meyer | browsertitle:meyer^3)</str>
  <lst name="explain">
    <str name="ID3">
2.5594494 = max of:
  0.5276203 = weight(content:meyer in 1767) [DefaultSimilarity], result of:
    0.5276203 = score(doc=1767,freq=5.0), product of:
      0.32961872 = queryWeight, product of:
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.057556875 = queryNorm
      1.600699 = fieldWeight in 1767, product of:
        2.236068 = tf(freq=5.0), with freq of:
          5.0 = termFreq=5.0
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.125 = fieldNorm(doc=1767)
  1.0237797 = weight(title:meyer in 1767) [DefaultSimilarity], result of:
    1.0237797 = score(doc=1767,freq=2.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      3.0713391 = fieldWeight in 1767, product of:
        1.4142135 = tf(freq=2.0), with freq of:
          2.0 = termFreq=2.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.375 = fieldNorm(doc=1767)
  1.1944097 = weight(description:meyer in 1767) [DefaultSimilarity], result of:
    1.1944097 = score(doc=1767,freq=2.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      3.583229 = fieldWeight in 1767, product of:
        1.4142135 = tf(freq=2.0), with freq of:
          2.0 = termFreq=2.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.4375 = fieldNorm(doc=1767)
  2.5594494 = weight(browserTitle:meyer^3.0 in 1767) [DefaultSimilarity], result of:
    2.5594494 = fieldWeight in 1767, product of:
      1.4142135 = tf(freq=2.0), with freq of:
        2.0 = termFreq=2.0
      5.7913733 = idf(docFreq=14, maxDocs=1807)
      0.3125 = fieldNorm(doc=1767)
</str>
    <str name="ID4">
2.1717649 = max of:
  0.40869296 = weight(content:meyer in 286) [DefaultSimilarity], result of:
    0.40869296 = score(doc=286,freq=3.0), product of:
      0.32961872 = queryWeight, product of:
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.057556875 = queryNorm
      1.239896 = fieldWeight in 286, product of:
        1.7320508 = tf(freq=3.0), with freq of:
          3.0 = termFreq=3.0
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.125 = fieldNorm(doc=286)
  0.9652289 = weight(title:meyer in 286) [DefaultSimilarity], result of:
    0.9652289 = score(doc=286,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 286, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=286)
  0.9652289 = weight(description:meyer in 286) [DefaultSimilarity], result of:
    0.9652289 = score(doc=286,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 286, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=286)
  2.1717649 = weight(browserTitle:meyer^3.0 in 286) [DefaultSimilarity], result of:
    2.1717649 = fieldWeight in 286, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      5.7913733 = idf(docFreq=14, maxDocs=1807)
      0.375 = fieldNorm(doc=286)
</str>
    <str name="ID5">
2.1717649 = max of:
  0.40869296 = weight(content:meyer in 436) [DefaultSimilarity], result of:
    0.40869296 = score(doc=436,freq=3.0), product of:
      0.32961872 = queryWeight, product of:
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.057556875 = queryNorm
      1.239896 = fieldWeight in 436, product of:
        1.7320508 = tf(freq=3.0), with freq of:
          3.0 = termFreq=3.0
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.125 = fieldNorm(doc=436)
  0.9652289 = weight(title:meyer in 436) [DefaultSimilarity], result of:
    0.9652289 = score(doc=436,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 436, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=436)
  0.9652289 = weight(description:meyer in 436) [DefaultSimilarity], result of:
    0.9652289 = score(doc=436,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 436, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=436)
  2.1717649 = weight(browserTitle:meyer^3.0 in 436) [DefaultSimilarity], result of:
    2.1717649 = fieldWeight in 436, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      5.7913733 = idf(docFreq=14, maxDocs=1807)
      0.375 = fieldNorm(doc=436)
</str>

...

    <str name="ID1">
2.1717649 = max of:
  0.471918 = weight(content:meyer in 1174) [DefaultSimilarity], result of:
    0.471918 = score(doc=1174,freq=4.0), product of:
      0.32961872 = queryWeight, product of:
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.057556875 = queryNorm
      1.4317087 = fieldWeight in 1174, product of:
        2.0 = tf(freq=4.0), with freq of:
          4.0 = termFreq=4.0
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.125 = fieldNorm(doc=1174)
  0.9652289 = weight(title:meyer in 1174) [DefaultSimilarity], result of:
    0.9652289 = score(doc=1174,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 1174, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=1174)
  0.9652289 = weight(description:meyer in 1174) [DefaultSimilarity], result of:
    0.9652289 = score(doc=1174,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 1174, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=1174)
  2.1717649 = weight(browserTitle:meyer^3.0 in 1174) [DefaultSimilarity], result of:
    2.1717649 = fieldWeight in 1174, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      5.7913733 = idf(docFreq=14, maxDocs=1807)
      0.375 = fieldNorm(doc=1174)
</str>
    <str name="ID2">
2.1717649 = max of:
  0.471918 = weight(content:meyer in 1766) [DefaultSimilarity], result of:
    0.471918 = score(doc=1766,freq=4.0), product of:
      0.32961872 = queryWeight, product of:
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.057556875 = queryNorm
      1.4317087 = fieldWeight in 1766, product of:
        2.0 = tf(freq=4.0), with freq of:
          4.0 = termFreq=4.0
        5.726835 = idf(docFreq=15, maxDocs=1807)
        0.125 = fieldNorm(doc=1766)
  0.9652289 = weight(title:meyer in 1766) [DefaultSimilarity], result of:
    0.9652289 = score(doc=1766,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 1766, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=1766)
  0.9652289 = weight(description:meyer in 1766) [DefaultSimilarity], result of:
    0.9652289 = score(doc=1766,freq=1.0), product of:
      0.33333334 = queryWeight, product of:
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.057556875 = queryNorm
      2.8956866 = fieldWeight in 1766, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        5.7913733 = idf(docFreq=14, maxDocs=1807)
        0.5 = fieldNorm(doc=1766)
  2.1717649 = weight(browserTitle:meyer^3.0 in 1766) [DefaultSimilarity], result of:
    2.1717649 = fieldWeight in 1766, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      5.7913733 = idf(docFreq=14, maxDocs=1807)
      0.375 = fieldNorm(doc=1766)
</str>
</lst>

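After changing the word, the results that were previously in positions 1 and 2 suddenly appear at the end of the result list. It looks as if the ordering among results with the same score has changed, so that they now come last in that group. How is that possible? How can I make these results more random, so that the document with the new hyphenated word does not appear at the top of the list but in third place, as in the first search?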
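For the document that was changed (ID3), the two explain outputs differ only in termFreq and fieldNorm. The sketch below compares the winning browserTitle clause before and after the change, using only values copied from the outputs and the tf = sqrt(freq) relation they show. It describes what changed in the score, not why the analyzer produced a second occurrence of the term.

```python
import math

IDF = 5.7913733  # idf(docFreq=14, maxDocs=1807), copied from the explain output

def field_weight(term_freq, field_norm):
    """fieldWeight = tf * idf * fieldNorm, with tf = sqrt(termFreq)."""
    return math.sqrt(term_freq) * IDF * field_norm

before = field_weight(1.0, 0.375)   # "Meyer": termFreq=1, fieldNorm=0.375
after = field_weight(2.0, 0.3125)   # "Meyer-Landrut": termFreq=2, fieldNorm=0.3125

tf_gain = math.sqrt(2.0) / math.sqrt(1.0)  # term frequency factor goes up
norm_loss = 0.3125 / 0.375                 # length norm goes down (more tokens)

if __name__ == "__main__":
    print("before:", round(before, 4))
    print("after: ", round(after, 4))
    print("tf gain:", round(tf_gain, 4), "norm loss:", round(norm_loss, 4))
    print("net factor:", round(after / before, 4))
```

The term frequency factor grows by about 1.41 while the field norm shrinks by only about 0.83, so the net effect is a higher score for the document containing the hyphenated word.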

0 answers:

No answers