我使用solr的facet模式来查找重复项。它工作得很好,但我无法弄清楚如何获取对象id。
>>> from haystack.query import SearchQuerySet
>>> sqs = SearchQuerySet().facet('text_string', limit=-1)
>>> sqs.facet_counts()
{
'dates': {},
'fields': {
'text_string': [
('the red ballon', 4),
('my grand pa is an alien', 2),
('be kind rewind', 12),
],
},
'queries': {}
}
我怎样才能获得我的对象的id'红色气球','我的爷爷是外星人'等等,我是否必须在solr的schema.xml中添加id字段?
我期待这样的事情:
>>> sqs.facet_counts()
{
'dates': {},
'fields': {
'text_string': [
(object_id, 'the red ballon', 4),
(object_id, 'my grand pa is an alien', 2),
(object_id, 'be kind rewind', 12),
],
},
'queries': {}
}
编辑:添加了schema.xml和search_indexes.py
solr的schema.xml
...
<fields>
<!-- general -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="django_ct" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="django_id" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="_version_" type="long" indexed="true" stored ="true"/>
<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
<dynamicField name="*_l" type="long" indexed="true" stored="true"/>
<dynamicField name="*_t" type="text_en" indexed="true" stored="true"/>
<dynamicField name="*_b" type="boolean" indexed="true" stored="true"/>
<dynamicField name="*_f" type="float" indexed="true" stored="true"/>
<dynamicField name="*_d" type="double" indexed="true" stored="true"/>
<dynamicField name="*_dt" type="date" indexed="true" stored="true"/>
<dynamicField name="*_p" type="location" indexed="true" stored="true"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
<field name="text" type="text_en" indexed="true" stored="true" multiValued="false" termVectors="true" />
<field name="title" type="text_en" indexed="true" stored="true" multiValued="false" />
<!-- Used for duplicate content detection -->
<copyField source="title" dest="text_string" />
<field name="text_string" type="string" indexed="true" stored="true" multiValued="false" />
<field name="pk" type="long" indexed="true" stored="true" multiValued="false" />
</fields>
<!-- field to use to determine and enforce document uniqueness. -->
<uniqueKey>id</uniqueKey>
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>text</defaultSearchField>
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="AND"/>
...
searche_indexes.py
class VideoIndex(indexes.SearchIndex, indexes.Indexable):
text = indexes.CharField(document=True, use_template=True)
pk = indexes.IntegerField(model_attr='pk')
title = indexes.CharField(model_attr='title', boost=1.125)
def index_queryset(self, using=None):
return Video.on_site.all()
def get_model(self):
return Video
答案 0 :(得分:0)
Faceting
是将搜索结果排列成类别(基于索引术语)。在每个类别中,Solr
报告相关术语的命中数,称为方面约束。通过分面,用户可以轻松浏览电影网站和产品评论网站等网站上的搜索结果,其中类别和类别中有许多项目。
这是一个很好的例子......
在您的情况下,您可能需要再次触发查询以获取ID和其他详细信息....