Django Haystack returns strange data when indexing multiple models

Posted: 2013-02-14 15:52:23

Tags: django django-haystack

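I have a strange problem that I am trying to track down. When I index more than one model, my Haystack index returns more results than the model itself contains.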

First, for testing, I have the following Django model definition:

class Designator(models.Model):
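    # Pass the callable (no parentheses) so the default is computed for each new object, not once at import time.
    created_date = models.DateTimeField(default=datetime.now, blank=False, editable=False)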
    modified_date = AutoDateTimeField(blank=True, editable=False)
    created_by = models.ForeignKey(User)
    lastmodified_by = models.ForeignKey(User, blank=True, null=True, related_name="%(app_label)s_%(class)s_related")
    number = models.IntegerField(unique=True)
    description = models.CharField(max_length=50)

This model is indexed by Haystack through this class:

class DesignatorIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    number = indexes.CharField(model_attr='number')
    description = indexes.CharField(model_attr='description')

    def get_model(self):
        return Designator

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.filter(modified_date__lte=datetime.now()).order_by('number')

Here are the results I get back from the Django shell after building the index.

>>> from haystack.query import SearchQuerySet
>>> from designator.models import Designator
>>> sqs = SearchQuerySet().models(Designator).filter(text='computer')
>>> sqs.count()
5
>>> for idx, s in enumerate(sqs):
...     print '%s - %s' % (idx, s.text.replace('\n', ' '))
...
0 - 8 COMPUTER MONITOR
1 - 9 COMPUTER PRINTER
2 - 10 COMPUTER CPU
3 - 38 COMPUTER KEYBOARDS
4 - 40 COMPUTER-MISC
>>> d = Designator.objects.filter(description__contains='computer')
>>> d.count()
5
>>> for a in d: print '%s - %s' % (a.number, a.description)
...
8 - COMPUTER MONITOR
9 - COMPUTER PRINTER
10 - COMPUTER CPU
38 - COMPUTER KEYBOARDS
40 - COMPUTER-MISC
>>>

These results look correct. The index returns the same data as the model.

So I added another model to be indexed, and then completely rebuilt all of the indexes. The new model and its index look like this:

class Vendor(models.Model):
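    # Pass the callable (no parentheses) so the default is computed for each new object, not once at import time.
    created_date = models.DateTimeField(default=datetime.now, blank=False, editable=False)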
    modified_date = AutoDateTimeField(blank=True, editable=False)
    created_by = models.ForeignKey(User)
    lastmodified_by = models.ForeignKey(User, blank=True, null=True, related_name="%(app_label)s_%(class)s_related")
    name = models.CharField(max_length=70)
    street_address = models.CharField(max_length=70, blank=True)
    city = models.CharField(max_length=50, blank=True)
    state = USStateField(blank=True)
    zip = models.CharField(max_length=5, blank=True)
    phone = PhoneNumberField(blank=True)
    email = models.EmailField(blank=True)

class VendorIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    name = indexes.CharField(model_attr='name')

    def get_model(self):
        return Vendor

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.filter(modified_date__lte=datetime.now()).order_by('name')

Now the results from my index go a little crazy.

>>> from haystack.query import SearchQuerySet
>>> from designator.models import Designator
>>> from vendor.models import Vendor
>>> sqs = SearchQuerySet().models(Designator).filter(text='computer')
>>> sqs.count()
5
>>> for idx, s in enumerate(sqs):
...     print '%s - %s' % (idx, s.text.replace('\n', ' '))
...
0 - 8 COMPUTER MONITOR
1 - 9 COMPUTER PRINTER
2 - 8 COMPUTER MONITOR
3 - 9 COMPUTER PRINTER
4 - 8 COMPUTER MONITOR
5 - 9 COMPUTER PRINTER
>>> d = Designator.objects.filter(description__contains='computer')
>>> d.count()
5
>>> for a in d: print '%s - %s' % (a.number, a.description)
...
8 - COMPUTER MONITOR
9 - COMPUTER PRINTER
10 - COMPUTER CPU
38 - COMPUTER KEYBOARDS
40 - COMPUTER-MISC
>>> sqs = SearchQuerySet().models(Vendor).filter(text='computer')
>>> sqs.count()
35
>>> v = Vendor.objects.filter(name__contains='computer')
>>> v.count()
36
>>>
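To make the mismatch in the second session easier to see, here is a minimal sketch that tallies the rows printed for the Designator query. The rows and the reported count are copied from the transcript above; the helper name `repeated` is only an illustration and is not part of Haystack.

```python
from collections import Counter

# Rows printed while iterating the Designator SearchQuerySet (second session).
rows = [
    "8 COMPUTER MONITOR",
    "9 COMPUTER PRINTER",
    "8 COMPUTER MONITOR",
    "9 COMPUTER PRINTER",
    "8 COMPUTER MONITOR",
    "9 COMPUTER PRINTER",
]
reported_count = 5  # the value sqs.count() returned in that session


def repeated(items):
    """Return {item: occurrences} for every item that appears more than once."""
    return {item: n for item, n in Counter(items).items() if n > 1}


duplicates = repeated(rows)
print("rows iterated:", len(rows), "reported count:", reported_count)
print("repeated rows:", duplicates)
```

The tally shows two symptoms at once: iteration yields six rows while `count()` reports five, and only two distinct records appear, each three times.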

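OK, that is confusing. What could be going wrong here? Is my index set up incorrectly? Could something in my data be causing the problem? I am not sure what I am missing.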
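One way to narrow this down might be to compare primary keys rather than printed text. The sketch below is pure Python and does not identify the cause; `compare_ids` is a hypothetical helper, and the key values in the example call are invented for illustration.

```python
from collections import Counter


def compare_ids(index_ids, db_ids):
    """Summarize how primary keys from the search index differ from the database.

    index_ids: keys collected by iterating the search results (may repeat).
    db_ids:    keys returned by the equivalent ORM query.
    """
    index_counts = Counter(index_ids)
    index_set = set(index_counts)
    db_set = set(db_ids)
    return {
        "duplicated_in_index": sorted(pk for pk, n in index_counts.items() if n > 1),
        "missing_from_index": sorted(db_set - index_set),
        "only_in_index": sorted(index_set - db_set),
    }


# Hypothetical keys: the index repeats two records and never yields the other three.
report = compare_ids([1, 2, 1, 2, 1, 2], [1, 2, 3, 4, 5])
print(report)
```

In the Django shell the inputs could be built as, for example, `[int(r.pk) for r in sqs]` and `list(Designator.objects.filter(description__contains='computer').values_list('pk', flat=True))`, assuming integer primary keys. Running the same comparison for Vendor should show which record accounts for the 35 versus 36 difference.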

Sorry for the long post; I did not know how to condense the problem.

Thanks for your help, -Jay

0 answers:

No answers