Combining Solr boolean queries with index-time boosts

Date: 2011-07-09 22:35:48

Tags: solr

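I have a site that uses Solr 1.4.1 for relevance/recommendations. In some places I use boolean-style queries such as +(+type:aoh_company +aoh_dictionary_tids:623). They return the expected results, but the order of the results seems arbitrary.

I tried to control the ranking of documents by setting index-time boosts, but these queries seem to ignore them.

An example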

  • The query URL is http://localhost:4930/solr/prod/select?rows=5&start=0&q.alt=(type%3Aaoh_company)+(aoh_dictionary_tids%3A623)&q=
  • The results come back in this order (index-time boost values in parentheses):
    1. 17132(1.22)
    2. 17179(1.02)
    3. 17131(1.10)
    4. 17133(1.10)
    5. 17184(1.10)
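  • Clearly, result #2 should not come before #3-5 if the boosts alone decided the order.
  • Given that this is a boolean query, nothing else should make much difference to the ranking.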
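
The request in the example above can also be assembled programmatically, which makes it easier to toggle debugQuery while investigating. This is only a minimal sketch: the helper name build_select_url is hypothetical, and the host, port, and core are simply the ones from the URL in the question.

```python
from urllib.parse import urlencode

# Host, port and core are copied from the URL in the question.
BASE_URL = "http://localhost:4930/solr/prod/select"


def build_select_url(debug: bool = False) -> str:
    """Build the select URL used in the question (hypothetical helper)."""
    params = [
        ("rows", 5),
        ("start", 0),
        # A space between the two clauses; urlencode turns it into '+'.
        ("q.alt", "(type:aoh_company) (aoh_dictionary_tids:623)"),
        ("q", ""),
    ]
    if debug:
        # Ask Solr to include the score explanation in the response.
        params.append(("debugQuery", "true"))
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url())
    print(build_select_url(debug=True))
```

Note that urlencode percent-encodes the parentheses, whereas the URL in the question leaves them raw; both forms decode to the same q.alt value.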

Debug output

I tried to debug the query above by appending debugQuery=true to it, so it becomes http://localhost:4930/solr/prod/select?rows=5&start=0&q.alt=(type%3Aaoh_company)+(aoh_dictionary_tids%3A623)&q=&debugQuery=true

The output is very verbose, but here it is:

<lst name="debug">
  <null name="rawquerystring"/>
  <null name="querystring"/>
  <str name="parsedquery">+(+type:aoh_company +aoh_dictionary_tids:623)</str>
  <str name="parsedquery_toString">+(+type:aoh_company +aoh_dictionary_tids:623)</str>
  <lst name="explain">
    <str name="50hves/node/17132">
    1.7819747 = (MATCH) sum of:
      0.9014403 = (MATCH) weight(type:aoh_company in 1805), product of:
        0.37135038 = queryWeight(type:aoh_company), product of:
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          0.15297863 = queryNorm
        2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1805), product of:
          1.0 = tf(termFreq(type:aoh_company)=1)
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          1.0 = fieldNorm(field=type, doc=1805)
      0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1805), product of:
        0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.15297863 = queryNorm
        0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1805), product of:
          1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1805)
    </str>
    <str name="50hves/node/17179">
    1.7819747 = (MATCH) sum of:
      0.9014403 = (MATCH) weight(type:aoh_company in 1896), product of:
        0.37135038 = queryWeight(type:aoh_company), product of:
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          0.15297863 = queryNorm
        2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1896), product of:
          1.0 = tf(termFreq(type:aoh_company)=1)
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          1.0 = fieldNorm(field=type, doc=1896)
      0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1896), product of:
        0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.15297863 = queryNorm
        0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1896), product of:
          1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1896)
    </str>
    <str name="50hves/node/17131">
    1.7819747 = (MATCH) sum of:
      0.9014403 = (MATCH) weight(type:aoh_company in 1905), product of:
        0.37135038 = queryWeight(type:aoh_company), product of:
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          0.15297863 = queryNorm
        2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1905), product of:
          1.0 = tf(termFreq(type:aoh_company)=1)
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          1.0 = fieldNorm(field=type, doc=1905)
      0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1905), product of:
        0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.15297863 = queryNorm
        0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1905), product of:
          1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1905)
    </str>
    <str name="50hves/node/17133">
    1.7819747 = (MATCH) sum of:
      0.9014403 = (MATCH) weight(type:aoh_company in 1906), product of:
        0.37135038 = queryWeight(type:aoh_company), product of:
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          0.15297863 = queryNorm
        2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1906), product of:
          1.0 = tf(termFreq(type:aoh_company)=1)
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          1.0 = fieldNorm(field=type, doc=1906)
      0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1906), product of:
        0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.15297863 = queryNorm
        0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1906), product of:
          1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1906)
    </str>
    <str name="50hves/node/17184">
    1.6058679 = (MATCH) sum of:
      0.9014403 = (MATCH) weight(type:aoh_company in 1892), product of:
        0.37135038 = queryWeight(type:aoh_company), product of:
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          0.15297863 = queryNorm
        2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1892), product of:
          1.0 = tf(termFreq(type:aoh_company)=1)
          2.4274657 = idf(docFreq=457, maxDocs=1909)
          1.0 = fieldNorm(field=type, doc=1892)
      0.7044275 = (MATCH) weight(aoh_dictionary_tids:623 in 1892), product of:
        0.9284928 = queryWeight(aoh_dictionary_tids:623), product of:
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.15297863 = queryNorm
        0.7586785 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1892), product of:
          1.0 = tf(termFreq(aoh_dictionary_tids:623)=1)
          6.069428 = idf(docFreq=11, maxDocs=1909)
          0.125 = fieldNorm(field=aoh_dictionary_tids, doc=1892)
    </str>
  </lst>
  <str name="QParser">DisMaxQParser</str>
  <str name="altquerystring">org.apache.lucene.search.BooleanQuery:+type:aoh_company +aoh_dictionary_tids:623</str>
  <null name="boostfuncs"/>
  <lst name="timing">
    <double name="time">7.0</double>
    <lst name="prepare">
      <double name="time">1.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.FacetComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.StatsComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.SpellCheckComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.DebugComponent">
        <double name="time">0.0</double>
      </lst>
    </lst>
    <lst name="process">
      <double name="time">6.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.FacetComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.StatsComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.SpellCheckComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.DebugComponent">
        <double name="time">6.0</double>
      </lst>
    </lst>
  </lst>
</lst>

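As I read it, the first four results score 1.7819747 and the fifth scores 1.6058679. I cannot see the boost values anywhere in there, so it looks as though they are not a factor in the ranking equation.

So what am I doing wrong? Is there something I need to do to make Solr take the boosts into account? Is there a way to inspect the boost values stored in Solr? They look correct in the documents I send to it, but I cannot find a way to view the stored values.

Also, here is the relevant part of my schema.xml: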
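
To make the reading of the explain output concrete, the two scores can be recomputed from the factors it lists. The sketch below only replays the arithmetic shown in the debug output (queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm, term weight = queryWeight × fieldWeight); the function names are made up for illustration.

```python
# All constants are copied from the explain output in the question.
QUERY_NORM = 0.15297863
IDF_TYPE = 2.4274657   # idf(docFreq=457, maxDocs=1909)
IDF_TIDS = 6.069428    # idf(docFreq=11, maxDocs=1909)


def term_weight(idf: float, tf: float, field_norm: float) -> float:
    """Weight of one term clause, following the structure of the explain output."""
    query_weight = idf * QUERY_NORM
    field_weight = tf * idf * field_norm
    return query_weight * field_weight


def doc_score(tids_field_norm: float) -> float:
    """Sum of both clauses; only the aoh_dictionary_tids fieldNorm varies by doc."""
    type_part = term_weight(IDF_TYPE, tf=1.0, field_norm=1.0)
    tids_part = term_weight(IDF_TIDS, tf=1.0, field_norm=tids_field_norm)
    return type_part + tids_part


if __name__ == "__main__":
    print(f"fieldNorm 0.15625: {doc_score(0.15625):.7f}")
    print(f"fieldNorm 0.125:   {doc_score(0.125):.7f}")
```

The only per-document input is the fieldNorm of aoh_dictionary_tids: 0.15625 for the first four documents and 0.125 for the fifth. The recomputed sums agree with the reported 1.7819747 and 1.6058679 up to floating-point rounding, so whatever effect the boosts have must already be folded into that fieldNorm.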

<types>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
  <fieldType name="integer" class="solr.IntField" omitNorms="true"/>
</types>
<fields>
  <field name="type" type="string" indexed="true" stored="true"/>
  <field name="aoh_dictionary_tids"  type="integer" indexed="true" stored="true" multiValued="true" omitNorms="false"/>
</fields>

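In the answer below, fyr mentions that norms need to be enabled on a field for boost values to be applied. So I would like to amend my question slightly:

  • Is it enough to enable norms on one of the queried fields for the boosts to be applied?
  • Does omitNorms="false" on the field override omitNorms="true" on the fieldType?

Any help is much appreciated.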

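1 Answer:

Answer 0: (score: 0)

You will not see the boost in the explain output. An index-time boost is applied to the norm of a field in a specific document. It acts like a multiplier.

If norms are enabled, your boost value is used at index time. If you use DefaultSimilarity and norms are enabled, the norm is always part of the similarity function.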
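
One consequence of folding the boost into the norm is worth illustrating. In Lucene versions of that era the norm is stored as a single byte, so small differences between boosts can be lost. The sketch below is a Python port of Lucene's SmallFloat.floatToByte315 and byte315ToFloat as I understand them; that Solr 1.4.1 with DefaultSimilarity encodes norms this way is an assumption here, and the function names are my own.

```python
import struct

# Offset between the float32 exponent and the byte encoding's exponent
# (assumption: mirrors Lucene's SmallFloat "315" encoding).
_FZERO = (63 - 15) << 3


def float_to_byte315(value: float) -> int:
    """Encode a float into one unsigned byte, truncating low-order bits."""
    bits = struct.unpack(">i", struct.pack(">f", value))[0]
    small = bits >> 21
    if small <= _FZERO:
        return 0 if bits <= 0 else 1   # zero/negative, or underflow
    if small >= _FZERO + 0x100:
        return 255                     # overflow: clamp to largest value
    return small - _FZERO


def byte315_to_float(encoded: int) -> float:
    """Decode a byte produced by float_to_byte315 back into a float."""
    if encoded == 0:
        return 0.0
    bits = ((encoded & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(raw_norm: float) -> float:
    """Value that survives a round trip through the one-byte encoding."""
    return byte315_to_float(float_to_byte315(raw_norm))


if __name__ == "__main__":
    # Hypothetical case: a length norm of 1.0 multiplied by each boost.
    for boost in (1.02, 1.10, 1.22, 1.25):
        print(boost, "->", stored_norm(1.0 * boost))
```

Under this encoding the representable values near 1.0 are 1.0, 1.25, 1.5, and so on, which means boosts of 1.02, 1.10 and 1.22 applied to the same length norm would all be stored as the same value. The fieldNorm values 0.15625 and 0.125 in the explain output are both representable steps of this encoding, which is consistent with the boosts having been applied and then rounded away.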

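Edit, in response to the follow-up questions:

  1. Enabling norms is enough for the boosts to take effect. Norms are the per-field weighting data that the index stores for each document. The index-time boost is multiplied into the norm value and saved in the norm for that field.

  2. omitNorms in the field declaration overrides the type definition. Your explain output shows this as well: the norm value for aoh_dictionary is not equal to 1. If norms were disabled, a default of 1 would be applied.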