Solr: boost a query by field value and, within each value, by newest date

Time: 2016-05-10 14:36:11

Tags: sorting solr lucene relevance solr-boost

We have the following settings in schema.xml:

<field name="last_modified" type="date" indexed="true" stored="true" multiValued="false" omitTermFreqAndPositions="true"/>
...

<field name="prefix" type="string" indexed="true" stored="true" omitTermFreqAndPositions="true"/>

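Our goal is to sort the documents as follows:

  1. prefix = 9999, newest documents (by last_modified) first
  2. prefix = 1004 or prefix = 1005, newest documents (by last_modified) first

Our code: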

    {!boost b=recip(ms(NOW,last_modified),3.16e11,1,1)}prefix:9999^1000000 OR {!boost b=recip(ms(NOW,last_modified),3.16e-11,1,1)}prefix:1004^600000 OR {!boost b=recip(ms(NOW,last_modified),3.16e-11,1,1)}prefix:1005^600000
    

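Result: the query above does not work as expected!

We assumed that omitTermFreqAndPositions=true would switch off the TF/IDF weighting, so that scoring by boost alone would work. But that does not seem to be the case! Please help us :-)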
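For reference, Solr's recip(x,m,a,b) function query evaluates to a / (m*x + b). The sketch below is plain Java, not Solr code; the class name RecipDemo and the sample ages are made up for illustration. It evaluates that formula for a few document ages with both values of m that appear in the query above (3.16e11 in the first clause, 3.16e-11 in the other two). With m = 3.16e-11 a one-year-old document gets a factor of about 0.5, whereas with m = 3.16e11 the factor is practically zero for any document that is not brand new, so the differing exponent in the first clause may be worth double-checking.

```java
class RecipDemo {
    // Same formula as Solr's recip(x, m, a, b) function query: a / (m * x + b).
    static double recip(double x, double m, double a, double b) {
        return a / (m * x + b);
    }

    public static void main(String[] args) {
        double dayMs = 24.0 * 60 * 60 * 1000;   // milliseconds per day
        double[] ageDays = {0, 30, 365, 3650};  // hypothetical document ages
        for (double d : ageDays) {
            double x = d * dayMs;               // stands in for ms(NOW,last_modified)
            System.out.printf("age=%5.0f d  m=3.16e-11 -> %.4f   m=3.16e11 -> %.3e%n",
                    d, recip(x, 3.16e-11, 1, 1), recip(x, 3.16e11, 1, 1));
        }
    }
}
```

Because {!boost} multiplies the wrapped query's score by this factor, a factor near zero would cancel out even a very large term boost.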

1 answer:

Answer 0 (score: 2)

So we found a solution!

  1. Create your own Similarity (a simple Java class). For a clearer, step-by-step description of how to do this, see "How to compile a custom similarity class for SOLR / Lucene using Eclipse".

The class we used:

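package com.luxactive;

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.similarities.DefaultSimilarity;

// Similarity that pins every overridden scoring factor to 1.0,
// so these factors no longer influence the score.
public class MyNewSimilarityClass extends DefaultSimilarity {

    // Ignore how many of the query's clauses matched.
    @Override
    public float coord(int overlap, int maxOverlap) {
        return 1.0f;
    }

    // Ignore how rare the term is in the index.
    @Override
    public float idf(long docFreq, long numDocs) {
        return 1.0f;
    }

    // Ignore the length of the field.
    @Override
    public float lengthNorm(FieldInvertState state) {
        return 1.0f;
    }

    // Ignore how often the term occurs in the document.
    @Override
    public float tf(float freq) {
        return 1.0f;
    }
}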
  2. Build a simple jar containing your Similarity class.
  3. Copy the jar to a folder on your Solr server; we used: SOLRFOLDER/solr-4.8.0/example/solr/dih

The next steps must be repeated for every collection you have!

  4. Edit the solrconfig.xml at: SOLRFOLDER/solr-4.8.0/example/solr/collection/conf/solrconfig.xml
    Add <lib dir="../dih" regex=".*\.jar" /> to load the custom jar.
  5. Edit the schema.xml in the same folder.

Add the following:

<!-- Global similarity factory; it lets each fieldType declare its own similarity, here com.luxactive.MyNewSimilarityClass -->
<similarity class="solr.SchemaSimilarityFactory"/>

<!-- TYPE String -->
<fieldType name="no_term_frequency_string" class="solr.StrField" sortMissingLast="true" >
    <similarity class="com.luxactive.MyNewSimilarityClass"/>
</fieldType>

<!-- TYPE Date -->
<fieldType name="no_term_frequency_date" class="solr.TrieDateField" sortMissingLast="true" >
    <similarity class="com.luxactive.MyNewSimilarityClass"/>
</fieldType>

<!-- TYPE Int-->
<fieldType name="no_term_frequency_int" class="solr.TrieIntField" sortMissingLast="true" >
    <similarity class="com.luxactive.MyNewSimilarityClass"/>
</fieldType>

These entries define your own field types (int, string and date) that use the new Similarity class. For fields of these types, the term-statistics factors overridden in MyNewSimilarityClass are fixed at 1.0, so the boost values in the query drive the score.
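To illustrate why fixing these factors helps, here is a rough sketch in plain Java (no Lucene dependency). It models only the per-term part of Lucene's classic TF-IDF score, tf * idf^2 * boost * norm, and leaves out coord and queryNorm. The class name ScoreSketch and the idf values 8.0 and 2.0 are invented for the example; real values depend on your index.

```java
class ScoreSketch {
    // Per-term contribution in the classic TF-IDF formula: tf * idf^2 * boost * norm.
    // Simplified model: coord and queryNorm are deliberately left out.
    static double termScore(double tf, double idf, double boost, double norm) {
        return tf * idf * idf * boost * norm;
    }

    public static void main(String[] args) {
        // Hypothetical statistics: the lower-boosted term is rarer (higher idf).
        double lowBoostRare = termScore(1.0, 8.0, 600000, 1.0);
        double highBoostCommon = termScore(1.0, 2.0, 1000000, 1.0);
        System.out.println("default-like factors: " + lowBoostRare + " vs " + highBoostCommon);

        // With every factor fixed at 1.0, only the boost is left.
        double lowBoostFlat = termScore(1.0, 1.0, 600000, 1.0);
        double highBoostFlat = termScore(1.0, 1.0, 1000000, 1.0);
        System.out.println("neutralized factors:  " + lowBoostFlat + " vs " + highBoostFlat);
    }
}
```

In the first comparison the rarer term outscores the term with the higher boost because idf enters squared; in the second the ordering follows the boosts, which is the behavior the custom field types aim for.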

  6. Now edit the fields that should use your custom Similarity by setting their type to one of the types you created.
    From: <field name="last_modified" type="date" indexed="true" stored="true" multiValued="false" />
    To: <field name="last_modified" type="no_term_frequency_date" indexed="true" stored="true" multiValued="false" />
  7. Restart the Solr server and enjoy your boosting :)
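After the restart, one way to check that the new scoring is in effect is to run a query with debugQuery=true and read the score explanation Solr returns. The sketch below only builds such a request URL using the JDK; the host, port and core name (http://localhost:8983/solr/collection1) and the class name DebugQueryUrl are assumptions for illustration, so adjust them to your installation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class DebugQueryUrl {
    // Builds a /select URL that returns prefix, last_modified and score,
    // plus the per-document score explanation (debugQuery=true).
    static String build(String baseUrl, String query) {
        try {
            return baseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8")
                    + "&fl=" + URLEncoder.encode("prefix,last_modified,score", "UTF-8")
                    + "&debugQuery=true";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed base URL; replace it with your own host and core.
        String base = "http://localhost:8983/solr/collection1";
        System.out.println(build(base, "prefix:9999^1000000"));
    }
}
```

Opening the printed URL in a browser or with curl shows, in the "explain" part of the debug output, which factors contributed to each document's score.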