无法使用Scala在Lucene 6.2中搜索

时间:2016-09-11 03:30:36

标签: mysql scala lucene slick

我正在尝试使用Lucene 6.2索引来自MySQL的数据(在Scala中使用Slick)。这是下面的代码

package oc.api.services

/**
  * Created by sujit on 9/7/16.
  */
import org.apache.lucene.document._
import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.index._
import org.apache.lucene.search.IndexSearcher
import java.io.{File, IOException}
import java.nio.file.Paths

import akka.actor.ActorSystem
import akka.event.{Logging, LoggingAdapter}
import akka.stream.ActorMaterializer
import oc.api.utils.{Config, DatabaseService}
import org.apache.lucene.analysis.core.KeywordAnalyzer
import org.apache.lucene.index.IndexWriterConfig.OpenMode
import org.apache.lucene.queryparser.classic.{QueryParser}
import org.apache.lucene.store.FSDirectory

import scala.concurrent.ExecutionContext

class Indexer extends Config {
  implicit val actorSystem = ActorSystem()
  implicit val executor: ExecutionContext = actorSystem.dispatcher
  implicit val log: LoggingAdapter = Logging(actorSystem, getClass)
  implicit val materializer: ActorMaterializer = ActorMaterializer()

  val databaseService = new DatabaseService(jdbcUrl, dbUser, dbPassword)

  val notesService = new NotesService(databaseService)    

  def setIndex = {
    val IndexStoreDir = Paths.get("/var/www/html/LuceneIndex")
    val analyzer = new KeywordAnalyzer()
    val writerConfig = new IndexWriterConfig(analyzer)
    writerConfig.setOpenMode(OpenMode.CREATE)
    writerConfig.setRAMBufferSizeMB(500)
    val directory = FSDirectory.open(IndexStoreDir)
    var writer = new IndexWriter(directory, writerConfig)
    val notes = notesService.getNotes() //Gets all notes from slick. Data is coming in getNotes()
    var doc = new Document()
    var count = 0

    val stringType = new FieldType()
    notes.map(_.foreach{
      case(note) =>
        doc = new Document()

        var field = new TextField("title", note.title, Field.Store.YES)
        doc.add(field)

        field = new TextField("teaser", note.teaser, Field.Store.YES)
        doc.add(field)

        field = new TextField("description", note.description, Field.Store.YES)
        doc.add(field)

        writer.addDocument(doc)
    })
    writer.commit()
  }

  def search(keyword: String) = {
    val IndexStoreDir = Paths.get("/var/www/html/LuceneIndex")
    var directoryReader = DirectoryReader.open(FSDirectory.open(IndexStoreDir))
    val analyzer = new StandardAnalyzer()

    val searcher = new IndexSearcher(directoryReader)

    val mqp = new QueryParser("title", analyzer) //MultiFieldQueryParser(filesToSearch,analyzer)
    val query = mqp.parse(keyword)

    val hits = searcher.search(query,10)
    val scoreDoc = hits.scoreDocs
    println(scoreDoc.length)
  }

}

object Indexer extends App {
  val index = new Indexer
  index.setIndex
  index.search("Donec")
}

setIndex函数在提供的Path中按预期工作。但是当我基于关键字搜索索引时,它会抛出0结果。搜索功能有什么错误吗?怎么解决这个问题?

1 个答案:

答案 0 :(得分:2)

这里的主要原因可能是您的分析仪不匹配。您使用KeywordAnalyzer进行索引,根本不进行分析。对于搜索,您使用StandardAnalyzer。在您的示例中,查询"Donec"将被解析并分析到title:donec,就像您使用new TermQuery(new Term("title", "donec"))一样。这只会匹配具有确切标题donec的文档,因为您在索引时使用了关键字分析器。您应该尝试使用相同的分析器进行索引。

另一件事可能是 - 我只能猜测 - notesService.getNotes()可能是Future[_](或类似的异步类型),因为它涉及光滑。如果是,则将调用中的所有文档添加到.map(),计划在将来解决后发生。然而,writer.commit()调用发生在调用线程中,可能在您添加所有文档之前,因此您应该将提交移动到map回调中。