Lucene.net - How can I search alternate index keys or alternate word combinations?

Date: 2010-12-08 05:10:22

Tags: asp.net language-agnostic lucene.net

In my Lucene index, I have the following keys:

  

id
fullText
user
date

Using the following method, my fullText search works very well.

Public Function ReadIndex(ByVal q As String, ByVal page As Integer?) As Domain.Pocos.LuceneResults Implements ILuceneService.ReadIndex
    ''# A timer variable to determine how long the method executes for
    Dim tStart As DateTime = DateTime.Now

    ''# Creates a container that we use to store all of the result ID's
    Dim IDList As List(Of Integer) = New List(Of Integer)

    ''# First we set the initial page number. 
    ''# If it's null, it means it's zero
    If page Is Nothing Then page = 0

    ''# [i] is the variable we use to extract the appropriate (needed)
    ''# documents from the results. Its initial value is the page number
    ''# multiplied by the number of results we want to return (in our
    ''# case 10). The [last] variable is used to stop the while loop at
    ''# the 10th record by simply adding 9 to the [i] variable.
    Dim i = page * 10
    Dim last As Integer = i + 9

    ''# Variables used by Lucene
    Dim reader As IndexReader = IndexReader.Open(luceneDirectory)
    Dim searcher As IndexSearcher = New IndexSearcher(reader)
    Dim query As Query = New TermQuery(New Term("fullText", q.ToLower))

    ''# We're using 10,000 as the maximum number of results to return
    ''# because I have a feeling that we'll never reach that full amount
    ''# anyways.  And if we do, who in their right mind is going to page
    ''# through all of the results?
    Dim topDocs As TopDocs = searcher.Search(query, Nothing, 10000)
    Dim doc As Document = Nothing

    ''# loop through the topDocs and grab the appropriate 10 results based
    ''# on the submitted page number
    While i <= last AndAlso i < topDocs.totalHits
        doc = searcher.Doc(topDocs.scoreDocs(i).doc)
        ''# The stored field is a string, so parse it before adding it to the Integer list
        IDList.Add(Integer.Parse(doc.[Get]("id")))
        i += 1
    End While

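    ''# Self-explanatory: release the searcher, then the reader it was built on
    ''# (closing a searcher created from an existing reader does not close that reader)
    searcher.Close()
    reader.Close()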
    Dim EventList As List(Of Domain.Event) = EventService.QueryEvents().Where(Function(e) (IDList.Contains(e.ID))).ToList()

    Dim tStop As DateTime = DateTime.Now
    Dim LucienResults As New Domain.Pocos.LuceneResults With {.EventList = EventList,
                                                              .ExecuteTime = (tStop - tStart),
                                                              .TotalResults = topDocs.totalHits}

    Return LucienResults
End Function

Now the problem I'm running into is figuring out how to add user and date searching to the method.

Basically, if I search for "some event", the results show up perfectly. However, if I search for user:joe date:12/07/2100, I don't get any results.

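Also, if I have the phrase "the quick brown fox jumped over the lazy dogs" and I search for "brown fox", I get results from the index, but if I search for "quick fox", I get none. Basically, I want to split the string on all spaces and search for each word individually.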

What do I need to add to this method to enable searching on specific keys and on alternate word combinations?

1 Answer:

Answer 0: (score: 1)

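You are essentially searching for "brown fox" and "quick fox" as a single token. You probably want to split on whitespace and build a BooleanQuery containing one TermQuery per word, or just hand the string to a QueryParser.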
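The split-and-combine approach could look roughly like the sketch below. It assumes the Lucene.Net 2.9-style API that the question's code appears to use (where the clause type is `BooleanClause.Occur`; later releases expose it as `Occur`), and the module and function names are made up for illustration. It lower-cases the input the same way the original method does, which only matches if the indexing analyzer lower-cased the terms as well.

```vbnet
Imports System
Imports Lucene.Net.Index
Imports Lucene.Net.Search

Module SplitTermsSketch
    ' Hypothetical helper: builds a query that requires every
    ' whitespace-separated word of q to appear in the given field.
    Public Function BuildAllWordsQuery(ByVal field As String, ByVal q As String) As Query
        Dim combined As New BooleanQuery()
        Dim words As String() = q.ToLower().Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)

        For Each word As String In words
            ' MUST = the word is required; use SHOULD instead to match any of the words
            combined.Add(New TermQuery(New Term(field, word)), BooleanClause.Occur.MUST)
        Next

        Return combined
    End Function
End Module
```

In the question's method this would replace the single `TermQuery`, for example `Dim query As Query = BuildAllWordsQuery("fullText", q)`, so that "quick fox" matches a document containing both words even when they are not adjacent.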

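The syntax you describe, "user:joe", is what the default QueryParser parses into new TermQuery(new Term("user", "joe")), which is what you want. Your current solution searches for a single "user:joe" token, which most analyzers would have split into two tokens, so you will never get a match with those analyzers.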
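A minimal sketch of the QueryParser route, again assuming the Lucene.Net 2.9-style API. The choice of `StandardAnalyzer` and the module and function names are assumptions; in practice the analyzer should be the same one that was used to build the index.

```vbnet
Imports Lucene.Net.Analysis.Standard
Imports Lucene.Net.QueryParsers
Imports Lucene.Net.Search

Module QueryParserSketch
    ' Hypothetical helper: lets QueryParser interpret the user's input.
    ' Words without a prefix are searched in "fullText"; "user:joe" is
    ' searched in the "user" field.
    Public Function ParseUserQuery(ByVal q As String) As Query
        ' Assumption: the index was built with StandardAnalyzer
        Dim analyzer As New StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_29)
        Dim parser As New QueryParser(Lucene.Net.Util.Version.LUCENE_29, "fullText", analyzer)

        ' Parse throws a ParseException on malformed input, so callers
        ' may want to catch it and fall back to a simpler query.
        Return parser.Parse(q)
    End Function
End Module
```

Whether a clause such as `date:12/07/2100` finds anything still depends on how the date field was indexed and analyzed, so the stored format needs to match what users type.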

Also, can't you tell IndexSearcher.Search to stop at the last index you are going to read, instead of 10000?
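As a rough illustration of that suggestion, the search can ask only for the hits up to the end of the requested page. The helper name and its parameters are hypothetical; the Lucene.Net calls are the same 2.9-style ones used in the question.

```vbnet
Imports System.Collections.Generic
Imports Lucene.Net.Search

Module PagingSketch
    ' Hypothetical helper: returns the stored "id" values for one page of results.
    Public Function ReadPageIds(ByVal searcher As IndexSearcher, ByVal query As Query,
                                ByVal page As Integer, ByVal pageSize As Integer) As List(Of Integer)
        Dim first As Integer = page * pageSize

        ' Only collect as many hits as are needed to reach the end of this page
        Dim topDocs As TopDocs = searcher.Search(query, Nothing, first + pageSize)

        Dim ids As New List(Of Integer)
        ' scoreDocs holds at most (first + pageSize) entries, so it bounds the loop
        For i As Integer = first To topDocs.scoreDocs.Length - 1
            ids.Add(Integer.Parse(searcher.Doc(topDocs.scoreDocs(i).doc).[Get]("id")))
        Next

        Return ids
    End Function
End Module
```

`topDocs.totalHits` still reports the total number of matching documents, so the result count shown to the user does not change.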

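While you're at it, if you are only interested in one field, don't read the Document instance with IndexSearcher.Doc. Use the FieldCache, which keeps an in-memory cache (held through weak references to the index segment readers) that lets you look up a single field quickly.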
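A hedged sketch of the FieldCache lookup, assuming the Lucene.Net 2.9-style `FieldCache_Fields.DEFAULT` entry point and assuming that `id` is indexed as a single untokenized term per document (the FieldCache reads indexed terms, not stored values). The helper name is made up, and the reader passed in should be the one the searcher was created from so that document numbers line up.

```vbnet
Imports System.Collections.Generic
Imports Lucene.Net.Index
Imports Lucene.Net.Search

Module FieldCacheSketch
    ' Hypothetical helper: resolves hit document numbers to "id" values
    ' through the FieldCache instead of loading each Document.
    Public Function ReadIdsFromCache(ByVal reader As IndexReader, ByVal topDocs As TopDocs,
                                     ByVal first As Integer, ByVal last As Integer) As List(Of Integer)
        ' One array entry per document number; built on first use, then cached
        Dim idByDoc As String() = FieldCache_Fields.[DEFAULT].GetStrings(reader, "id")

        Dim ids As New List(Of Integer)
        Dim i As Integer = first
        While i <= last AndAlso i < topDocs.scoreDocs.Length
            ids.Add(Integer.Parse(idByDoc(topDocs.scoreDocs(i).doc)))
            i += 1
        End While

        Return ids
    End Function
End Module
```

The first call pays the cost of populating the cache for the whole index; later lookups are plain array reads.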

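Finally, look at which analyzer you are using. Some are specific to other languages, and some have synonym or stemming support, etc.: things that [usually] make search easier to use.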