Question

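On project deploy, Lucene creates the indexes for the database tables. The following code is used in the project for that indexing, which covers hundreds of thousands of records: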

  public void startIndexing(int threadsToLoadObjects, int threadsForSubsequentFetch, int fetchSize, Class<?>[] types) {

    FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(eManager);
    long startTime = System.currentTimeMillis();

    try {
      log.info("Index Building Started..");
      if (ArrayUtils.isEmpty(types)) {
        log.info("Indexing all entities");
        fullTextEntityManager
          .createIndexer().batchSizeToLoadObjects(501)
          .threadsToLoadObjects(threadsToLoadObjects)
          .threadsForSubsequentFetching(threadsForSubsequentFetch)
          .progressMonitor(new SimpleIndexingProgressMonitor(5001))
          .idFetchSize(fetchSize)
          .cacheMode(CacheMode.IGNORE).startAndWait();
      } else {
        log.info("Indexing particular entities.");
        for (Class<?> type : types) {
          log.info("Indexing " + type.getCanonicalName());
          fullTextEntityManager
            .createIndexer(type).batchSizeToLoadObjects(501)
            .threadsToLoadObjects(threadsToLoadObjects)
            .threadsForSubsequentFetching(threadsForSubsequentFetch)
            .progressMonitor(new SimpleIndexingProgressMonitor(5001))
            .idFetchSize(fetchSize)
            .cacheMode(CacheMode.IGNORE).startAndWait();

          fullTextEntityManager.flushToIndexes();
        }
      }

      long completionTime = System.currentTimeMillis() - startTime;
      log.info("Completed. (Time " + completionTime + "ms)");
    } catch (Exception e) {
      log.error("Exception while rebuilding indexes", e);
    }
  }

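For 15 similar rows, Lucene indexes only 5 and skips the other 10. (The rows differ only slightly: an integer column is 1 in one row and 2 in another, or a name string is 'Khap' in one row and 'Mosh' in another.)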

After re-indexing, the result changes: sometimes more rows end up in the index, e.g. 7, and sometimes only 2 remain. (Checked in Lukeall.)

I don't have much experience with Lucene. What could be causing this, and what is the best way to debug it?
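To make the symptom concrete while debugging, one option is to compare the identifiers in the table with the identifiers that actually reached the index. The following is only a minimal sketch in plain Java: the class `IndexDiff`, its method names, and the sample ID lists are hypothetical, and it assumes the two ID lists have already been exported by hand (for example from a SQL query and from Luke). It does not call Hibernate Search or Lucene. The `duplicates` helper is included on the assumption that repeated identifier values are worth ruling out; nothing in the question confirms that this is the cause.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Hibernate Search or Lucene.
public class IndexDiff {

    /** IDs present in the database list but absent from the index list, ascending. */
    public static List<Long> missingFromIndex(Collection<Long> dbIds, Collection<Long> indexedIds) {
        Set<Long> missing = new TreeSet<>(dbIds);
        missing.removeAll(new HashSet<>(indexedIds));
        return new ArrayList<>(missing);
    }

    /** IDs that occur more than once in the given list, ascending. */
    public static List<Long> duplicates(Collection<Long> ids) {
        Set<Long> seen = new HashSet<>();
        Set<Long> repeated = new TreeSet<>();
        for (Long id : ids) {
            if (!seen.add(id)) {   // add() returns false when the id was already seen
                repeated.add(id);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        // Sample data only; replace with IDs exported from the table and from the index.
        List<Long> dbIds = List.of(1L, 2L, 3L, 4L, 5L);
        List<Long> indexedIds = List.of(2L, 5L);

        System.out.println("Missing from index: " + missingFromIndex(dbIds, indexedIds));
        System.out.println("Repeated ids in table: " + duplicates(dbIds));
    }
}
```

Running the comparison after each re-index would show whether the same rows are skipped every time or whether the missing set varies between runs.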

Lucene skips similar database records during mass indexing

0 answers: