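Solr: indexing multiple values as one field

Time: 2013-11-08 15:10:26

Tags: solr lucene indexing

I want to run range queries on double fields where four values belong together as a group, and each document can contain several such groups. So what I need is a field in which I can store something like this: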

<doc>
   <field name="id">id</field>
   <field name="valueGroup">1 2 3 4</field>
   <field name="valueGroup">5 6 7 8</field>
</doc>

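and then run a range query like this: valueGroup:[0,0,0,0 TO 3,8,8,8]. I can't index the values as separate single-value fields with multiValued="true", because each group has to be evaluated on its own. I know there is a LatLon field type, but it only holds two values. How can I get a field with more than 2 dimensions?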
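To illustrate why each group has to be evaluated on its own, here is a minimal sketch in plain Java (no Solr involved; the class and method names are hypothetical and not part of any library). It compares the desired per-group check with what you would effectively get from four independent multivalued fields, where each dimension may be satisfied by a different group.

```java
import java.util.Arrays;
import java.util.List;

public class GroupMatchDemo {

    /** True if every dimension of one group lies inside [lo, hi] (inclusive). */
    static boolean groupInRange(double[] group, double[] lo, double[] hi) {
        for (int i = 0; i < group.length; i++) {
            if (group[i] < lo[i] || group[i] > hi[i]) {
                return false;
            }
        }
        return true;
    }

    /** Desired semantics: at least one whole group lies inside the range. */
    static boolean matchesPerGroup(List<double[]> groups, double[] lo, double[] hi) {
        for (double[] group : groups) {
            if (groupInRange(group, lo, hi)) {
                return true;
            }
        }
        return false;
    }

    /** Flattened semantics: each dimension may be satisfied by a different group. */
    static boolean matchesFlattened(List<double[]> groups, double[] lo, double[] hi) {
        for (int dim = 0; dim < lo.length; dim++) {
            boolean hit = false;
            for (double[] group : groups) {
                if (group[dim] >= lo[dim] && group[dim] <= hi[dim]) {
                    hit = true;
                    break;
                }
            }
            if (!hit) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two groups in one document; neither lies entirely inside the range.
        List<double[]> groups = Arrays.asList(
            new double[] {1, 9, 9, 9},
            new double[] {9, 1, 1, 1});
        double[] lo = {0, 0, 0, 0};
        double[] hi = {3, 3, 3, 3};

        System.out.println("per group: " + matchesPerGroup(groups, lo, hi)); // false
        System.out.println("flattened: " + matchesFlattened(groups, lo, hi)); // true
    }
}
```

The flattened check reports a match because the first dimension is satisfied by the first group and the remaining three by the second, which is exactly the false positive the question wants to avoid.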

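1 answer:

Answer 0 (score: 0)

As I mentioned in my reply to your comment on my SO question, I also had a small requirement to perform some complex filtering. In the end I had to create a custom field type class, which let me override the method responsible for returning the query object so that it carries custom logic for filtering the results. This approach should work well for you: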

public class MyCustomFieldType extends FieldType {
    /**
     * {@inheritDoc}
     */
    @Override
    protected void init(final IndexSchema schema, final Map<String, String> args) {
        trueProperties |= TOKENIZED;
        super.init(schema, args);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void write(final XMLWriter xmlWriter, final String name, final Fieldable fieldable)
        throws IOException
    {
        xmlWriter.writeStr(name, fieldable.stringValue());
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void write(final TextResponseWriter writer, final String name, final Fieldable fieldable)
        throws  IOException
    {
        writer.writeStr(name, fieldable.stringValue(), true);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public SortField getSortField(final SchemaField field, final boolean reverse) {
        return getStringSort(field, reverse);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void setAnalyzer(final Analyzer analyzer) {
        this.analyzer = analyzer;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void setQueryAnalyzer(final Analyzer queryAnalyzer) {
        this.queryAnalyzer = queryAnalyzer;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public Query getFieldQuery(
        final QParser parser, final SchemaField field, final String externalVal)
    {
        // Do some parsing of the user's input (if necessary) from the query string (externalVal)
        final String parsedInput = ...

        // Instantiate your custom filter, taking note to wrap it in a caching wrapper!
        final Filter filter = new CachingWrapperFilter(
            new MyCustomFilter(field, parsedInput));

        // Return a query that runs your filter against all docs in the index
        // NOTE: depending on your needs, you may be able to do a more fine grained query here
        // instead of a MatchAllDocsQuery!!
        return new FilteredQuery(new MatchAllDocsQuery(), filter);
    }
}

Now you need a custom filter...

public class MyCustomFilter extends Filter {
    /**
     * The field that is being filtered.
     */
    private final SchemaField field;

    /**
     *  The value to filter against.
     */
    private final String filterBy;

    /**
     * 
     *
     * @param field     The field to perform filtering against.
     * @param filterBy  A value to filter by.
     */
    public MyCustomFilter(
        final SchemaField field,
        final String filterBy)
    {
        this.field = field;
        this.filterBy = filterBy;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public DocIdSet getDocIdSet(final IndexReader reader) throws IOException {

        final FixedBitSet bitSet = new FixedBitSet(reader.maxDoc());

        // find all the docs you want to run the filter against
        final Weight weight = new IndexSearcher(reader).createNormalizedWeight(
            new SOME_QUERY_TYPE_HERE());

        final Scorer docIterator = weight.scorer(reader, true, false);

        if (docIterator == null) {
            return bitSet;
        }

        int docId;

        while ((docId = docIterator.nextDoc()) != Scorer.NO_MORE_DOCS) {

            final Document doc = reader.document(docId);

            for (final String indexFieldValue : doc.getValues(field.getName())) {
                // CUSTOM LOGIC GOES HERE

                // If your criteria are met, consider the doc a match
                bitSet.set(docId);
            }
        }

        return bitSet;
    }

    /**
     * {@inheritDoc}
     */
    @Override
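    public boolean equals(final Object other) {
        // NEEDED FOR CACHING: one reasonable implementation treats two filters
        // as equal when they target the same field with the same filter value.
        if (this == other) {
            return true;
        }
        if (!(other instanceof MyCustomFilter)) {
            return false;
        }
        final MyCustomFilter that = (MyCustomFilter) other;
        return field.getName().equals(that.field.getName())
            && filterBy.equals(that.filterBy);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public int hashCode() {
        // NEEDED FOR CACHING: must stay consistent with equals()
        return 31 * field.getName().hashCode() + filterBy.hashCode();
    }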
}

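The example above is obviously very basic, but if you use it as a template, tune it for performance, and add your custom logic, you should get what you need. Also, be sure to implement the hashCode and equals methods in your filter, since they are used for caching. In the query string you can then supply an fq parameter like this: `?q=some query&fq=myfield:[0,0,0,0 TO 3,8,8,8]`.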
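As a rough idea of what the custom logic could look like for the four-value groups in the question, here is a minimal, Solr-independent sketch. `GroupRange` is a hypothetical helper, not a Lucene or Solr class. It assumes the raw range text (for example `[0,0,0,0 TO 3,8,8,8]`) reaches your code as a single string, that stored values look like `1 2 3 4`, and that both bounds are inclusive. It also implements equals and hashCode, since a value object like this is a natural basis for the filter's own equality.

```java
import java.util.Arrays;

public final class GroupRange {
    private final double[] lower;
    private final double[] upper;

    private GroupRange(double[] lower, double[] upper) {
        this.lower = lower;
        this.upper = upper;
    }

    /** Parses a range such as "[0,0,0,0 TO 3,8,8,8]"; "TO" is matched case-insensitively. */
    public static GroupRange parse(String externalVal) {
        String s = externalVal.trim();
        if (!s.startsWith("[") || !s.endsWith("]")) {
            throw new IllegalArgumentException("Expected [a,b,.. TO c,d,..] but got: " + externalVal);
        }
        String[] bounds = s.substring(1, s.length() - 1).trim().split("\\s+(?i:TO)\\s+");
        if (bounds.length != 2) {
            throw new IllegalArgumentException("Expected exactly one TO in: " + externalVal);
        }
        double[] lo = parseNumbers(bounds[0], ",");
        double[] hi = parseNumbers(bounds[1], ",");
        if (lo.length != hi.length) {
            throw new IllegalArgumentException("Bounds differ in dimension: " + externalVal);
        }
        return new GroupRange(lo, hi);
    }

    private static double[] parseNumbers(String text, String separatorRegex) {
        String[] parts = text.trim().split(separatorRegex);
        double[] out = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Double.parseDouble(parts[i].trim());
        }
        return out;
    }

    /** Tests one stored group such as "1 2 3 4" against the range, dimension by dimension. */
    public boolean matches(String storedGroup) {
        double[] values = parseNumbers(storedGroup, "\\s+");
        if (values.length != lower.length) {
            return false;
        }
        for (int i = 0; i < values.length; i++) {
            if (values[i] < lower[i] || values[i] > upper[i]) {
                return false;
            }
        }
        return true;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof GroupRange)) {
            return false;
        }
        GroupRange that = (GroupRange) other;
        return Arrays.equals(lower, that.lower) && Arrays.equals(upper, that.upper);
    }

    @Override
    public int hashCode() {
        return 31 * Arrays.hashCode(lower) + Arrays.hashCode(upper);
    }

    public static void main(String[] args) {
        GroupRange range = GroupRange.parse("[0,0,0,0 TO 3,8,8,8]");
        System.out.println(range.matches("1 2 3 4")); // true
        System.out.println(range.matches("5 6 7 8")); // false: 5 exceeds the first upper bound
    }
}
```

Inside the filter's loop over `doc.getValues(field.getName())`, the idea would be to call `range.matches(indexFieldValue)` and set the bit for `docId` only when it returns true.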

As I mentioned, this approach worked very well for me and my team, because we had very specific requirements for filtering content.

Good luck. :)