Why does Hibernate MassIndexer report that indexing is complete when it has not actually finished?

Time: 2019-06-20 03:22:03

Tags: hibernate hibernate-search

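I am trying to use MassIndexer to index a large data set (13.5 million records associated with 7-8 tables) into Elasticsearch. It logs a message saying that all records have been indexed once progress reaches 39.08%. I see the same problem locally and in production, and the percentage differs on every run.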

23:05:25,338 (Hibernate Search: Elasticsearch transport thread-2)  INFO SimpleIndexingProgressMonitor:90 - HSEARCH000031: Indexing speed: 1085.105591 documents/second; progress: 39.08%
23:05:25,339 (Hibernate Search: Elasticsearch transport thread-2)  INFO SimpleIndexingProgressMonitor:87 - HSEARCH000030: 5322450 documents indexed in 4904960 ms
23:05:25,339 (Hibernate Search: Elasticsearch transport thread-2)  INFO SimpleIndexingProgressMonitor:90 - HSEARCH000031: Indexing speed: 1085.115845 documents/second; progress: 39.08%
23:05:25,339 (Hibernate Search: Elasticsearch transport thread-2)  INFO SimpleIndexingProgressMonitor:87 - HSEARCH000030: 5322500 documents indexed in 4904961 ms
23:05:25,339 (Hibernate Search: Elasticsearch transport thread-2)  INFO SimpleIndexingProgressMonitor:90 - HSEARCH000031: Indexing speed: 1085.125854 documents/second; progress: 39.08%
23:05:36,103 (Hibernate Search: Elasticsearch transport thread-3) DEBUG request:194 - HSEARCH400082: Executed Elasticsearch HTTP POST request to path '/xyz/_forcemerge' with query parameters {} in 16734ms. Response had status 200 'OK'.
23:05:37,666 (Hibernate Search: Elasticsearch transport thread-3) DEBUG request:194 - HSEARCH400082: Executed Elasticsearch HTTP POST request to path '/xyz/_flush' with query parameters {} in 1562ms. Response had status 200 'OK'.
23:05:37,668 (Hibernate Search: Elasticsearch transport thread-3) DEBUG request:194 - HSEARCH400082: Executed Elasticsearch HTTP POST request to path '/xyz/_refresh' with query parameters {} in 1ms. Response had status 200 'OK'.
23:05:37,668 (main)  INFO SimpleIndexingProgressMonitor:78 - HSEARCH000028: Reindexed 13618954 entities

It should report that indexing is complete only after all records have been indexed.

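2 Answers:

Answer 0: (score: 1)

Your log clearly shows that errors occurred during mass indexing, which you did not mention in your first post.

You are regularly getting errors like this: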

10:48:28,125 (Hibernate Search: Elasticsearch transport thread-2) ERROR LogErrorHandler:71 - HSEARCH000058: Exception occurred org.hibernate.search.exception.SearchException: HSEARCH400007: Elasticsearch request failed.
Request: POST /_bulk with parameters {refresh=false}
Response: null
Subsequent failures:
    Entity com.example.model.XXXXXX  Id 855665929073643520  Work Type  org.hibernate.search.backend.AddLuceneWork

org.hibernate.search.exception.SearchException: HSEARCH400007: Elasticsearch request failed.
Request: POST /_bulk with parameters {refresh=false}
Response: null
    at org.hibernate.search.elasticsearch.work.impl.BulkWork.lambda$execute$1(BulkWork.java:77)
    at org.hibernate.search.util.impl.Futures.lambda$handler$1(Futures.java:57)
    at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
    at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
    at org.hibernate.search.elasticsearch.client.impl.DefaultElasticsearchClient$1.onFailure(DefaultElasticsearchClient.java:123)
    at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onDefinitiveFailure(RestClient.java:605)
    at org.elasticsearch.client.RestClient$1.retryIfPossible(RestClient.java:396)
    at org.elasticsearch.client.RestClient$1.failed(RestClient.java:375)
    at org.apache.http.concurrent.BasicFuture.failed(BasicFuture.java:134)
    at org.apache.http.impl.nio.client.AbstractClientExchangeHandler.failed(AbstractClientExchangeHandler.java:419)
    at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:375)
    at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92)
    at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39)
    at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException
    ... 11 more

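Essentially, this means that some indexing requests failed because Elasticsearch took too long to respond.

There can be many reasons for this.

Your Hibernate Search configuration looks very conservative (only two threads), so I don't think you are putting too much pressure on the Elasticsearch cluster.

I would suggest double-checking your Elasticsearch setup (the Elasticsearch documentation probably lists some do's and don'ts). Check that your Elasticsearch cluster is appropriately sized, that the servers themselves are appropriately sized, ...

You may also need to tune the hibernate.search configuration properties that govern communication with the Elasticsearch cluster: timeouts, number of connections, ... See https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#elasticsearch-integration-configuration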
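As a rough illustration of the kind of tuning meant here, the sketch below collects the client-side timeout and connection settings into a `java.util.Properties` object. The property names are the ones I recall from the Hibernate Search 5.x Elasticsearch integration documentation linked above, so please verify them against that page; the values and the class name `ElasticsearchClientTuning` are assumptions made up for the example, not recommendations.

```java
import java.util.Properties;

// Hypothetical helper: gathers Elasticsearch client settings for Hibernate Search 5.x.
// All values are placeholders to be adjusted for your own cluster.
class ElasticsearchClientTuning {

    static Properties tunedSettings() {
        Properties props = new Properties();
        // Overall time budget for a request, in milliseconds (assumed value).
        props.setProperty("hibernate.search.default.elasticsearch.request_timeout", "120000");
        // How long to wait for a response on an open connection, in milliseconds (assumed value).
        props.setProperty("hibernate.search.default.elasticsearch.read_timeout", "120000");
        // How long to wait when establishing a connection, in milliseconds (assumed value).
        props.setProperty("hibernate.search.default.elasticsearch.connection_timeout", "5000");
        // Size of the connection pool, in total and per Elasticsearch host (assumed values).
        props.setProperty("hibernate.search.default.elasticsearch.max_total_connection", "20");
        props.setProperty("hibernate.search.default.elasticsearch.max_total_connection_per_route", "4");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings so they can be copied into persistence.xml or hibernate.properties.
        tunedSettings().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The same keys can be placed directly in `hibernate.properties` or `persistence.xml`; building them in code is only a convenient way to show them together. Raising timeouts hides a slow cluster rather than fixing it, so it is worth checking the Elasticsearch side first.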

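Answer 1: (score: 0)

This looks a lot like HSEARCH-3462, which was fixed in 6.0.0.Alpha2 but not backported to 5.11.

Long story short: this is a logging problem, not an indexing problem. The last line states that everything was reindexed, and that is the one you should trust.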
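To see how the figures in the question's log fit together, here is a small arithmetic check. It assumes the progress monitor derives speed as documents divided by elapsed time and progress as documents divided by the total entity count; that is my reading of the log output, not something taken from the Hibernate Search source. The class name `ProgressCheck` is made up for the example.

```java
// Hypothetical helper that recomputes the figures from the last progress lines of the log.
class ProgressCheck {

    // Documents per second, given a document count and elapsed time in milliseconds.
    static double docsPerSecond(long documents, long elapsedMillis) {
        return documents * 1000.0 / elapsedMillis;
    }

    // Percentage of the total that the given document count represents.
    static double progressPercent(long documents, long total) {
        return documents * 100.0 / total;
    }

    public static void main(String[] args) {
        long lastLoggedDocuments = 5_322_500L; // from the last HSEARCH000030 line
        long elapsedMillis = 4_904_961L;       // from the same line
        long reindexedEntities = 13_618_954L;  // from the final HSEARCH000028 line

        System.out.printf("speed: %.3f documents/second%n",
                docsPerSecond(lastLoggedDocuments, elapsedMillis));
        System.out.printf("progress: %.2f%%%n",
                progressPercent(lastLoggedDocuments, reindexedEntities));
    }
}
```

Under those assumptions the recomputed speed and percentage agree with the last progress lines in the log, which is consistent with the progress messages simply lagging behind the final count rather than indexing stopping early.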

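I will see whether we can easily backport the fix to 5.10 / 5.11, but it may take some time before those branches are released again. Backport ticket (if you want to track progress): https://hibernate.atlassian.net/browse/HSEARCH-3622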