Question

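I'm working on a solution that uses an embedded Elasticsearch server on a single local machine. The scenario is:

1) Create a cluster with one node and import data: about 3 million records in about 180 indices and 911 shards. The data is available, searches work and return the expected data, and the cluster health looks fine: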

{
  "cluster_name" : "cn1441023806894",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 911,
  "active_shards" : 911,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

2) Now I shut down the server. This is my console output:

sie 31, 2015 2:51:36 PM org.elasticsearch.node.internal.InternalNode stop
INFO: [testbg] stopping ...
sie 31, 2015 2:51:50 PM org.elasticsearch.node.internal.InternalNode stop
INFO: [testbg] stopped
sie 31, 2015 2:51:50 PM org.elasticsearch.node.internal.InternalNode close
INFO: [testbg] closing ...
sie 31, 2015 2:51:50 PM org.elasticsearch.node.internal.InternalNode close
INFO: [testbg] closed

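The database folder is about 2.4 GB.

3) Now I start the server again... and it takes about 10 minutes to reach green status. Example health output during that time: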

{
  "cluster_name" : "cn1441023806894",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 68,
  "active_shards" : 68,
  "relocating_shards" : 0,
  "initializing_shards" : 25,
  "unassigned_shards" : 818
}

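After that process, the database folder is about 0.8 GB.

Then I shut the database down and start it again, and now it turns green within 10 seconds. All subsequent stop/start cycles are very fast.

My configuration: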

settings.put(SET_NODE_NAME, projectNameLC);
settings.put(SET_PATH_DATA, projectLocation + "\\" + CommonConstants.ANALYZER_DB_FOLDER); 
settings.put(SET_CLUSTER_NAME, clusterName);
settings.put(SET_NODE_DATA, true);
settings.put(SET_NODE_LOCAL, true);
settings.put(SET_INDEX_REFRESH_INTERVAL, "-1");
settings.put(SET_INDEX_MERGE_ASYNC, true);
//the following settings are my attempt to speed up loading on the 2nd startup
settings.put("cluster.routing.allocation.disk.threshold_enabled", false);
settings.put("index.number_of_replicas", 0);
settings.put("cluster.routing.allocation.disk.include_relocations", false);
settings.put("cluster.routing.allocation.node_initial_primaries_recoveries", 25);
settings.put("cluster.routing.allocation.node_concurrent_recoveries", 8);
settings.put("indices.recovery.concurrent_streams", 6);
settings.put("indices.recovery.concurrent_small_file_streams", 4);

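Questions:

1) What happens during the second start-up? The DB folder size drops from 2.4 GB to 800 MB.

2) If this process is necessary, can it be triggered manually, so that I can show a nice "please wait" dialog?

The user experience of opening the database for the second time is very bad, and I need to change that.

Cheers, Marcin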

Answer 1

On another forum - https://discuss.elastic.co/t/initializing-shards-second-db-start-up-takes-long-time/28357 - I got an answer from Mike Simos. The solution is to call a synced flush on the index once you have finished adding data to it.
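For illustration, here is a minimal sketch of triggering a synced flush through the REST endpoint `POST /_flush/synced` after the import finishes. It is not the code from my project. It assumes the embedded node has HTTP enabled and is reachable at `http://localhost:9200`; the class name `SyncedFlush` and both helper methods are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class SyncedFlush {

    // Builds the synced-flush endpoint. With no index names it targets all indices.
    static String syncedFlushUrl(String baseUrl, String... indices) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        if (indices.length == 0) {
            return base + "/_flush/synced";
        }
        return base + "/" + String.join(",", indices) + "/_flush/synced";
    }

    // Sends an empty POST to the synced-flush endpoint and returns the HTTP status code.
    static int syncedFlush(String baseUrl, String... indices) throws IOException {
        URL url = new URL(syncedFlushUrl(baseUrl, indices));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setConnectTimeout(5_000);
            conn.setReadTimeout(120_000); // flushing many shards can take a while
            conn.setDoOutput(true);
            conn.getOutputStream().close(); // empty request body
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        try {
            // Assumed address of the embedded node's HTTP endpoint.
            int status = syncedFlush("http://localhost:9200");
            System.out.println("synced flush returned HTTP " + status);
        } catch (IOException e) {
            System.out.println("could not reach the node: " + e.getMessage());
        }
    }
}
```

Call `syncedFlush(...)` once at the end of the import, before stopping the node. Passing index names, for example `syncedFlush("http://localhost:9200", "index_a", "index_b")`, limits the flush to those indices (the names here are placeholders).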

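That did the trick: my database now starts in 30 seconds instead of 10 minutes. The time spent flushing the data has moved into the import part of my business logic, which is acceptable. I also noticed that the impact on import time is not very large.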
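For the start-up time that remains, one possible way to back a "please wait" dialog is to block on the cluster health API until the status is green. The sketch below is an assumption-laden example, not something I have in production: it assumes HTTP is enabled on the embedded node at `http://localhost:9200`, the class `WaitForGreen` and its methods are hypothetical, and the JSON check is a crude string match rather than a real parser.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Scanner;

public class WaitForGreen {

    // Health endpoint that blocks server-side until green or until the timeout expires.
    static String healthUrl(String baseUrl, int timeoutSeconds) {
        return baseUrl + "/_cluster/health?wait_for_status=green&timeout=" + timeoutSeconds + "s";
    }

    // Crude check of the health response; a real application would use a JSON parser.
    static boolean isGreen(String healthJson) {
        return healthJson.replaceAll("\\s", "").contains("\"status\":\"green\"");
    }

    // Returns true if the cluster reported green within the timeout.
    static boolean waitForGreen(String baseUrl, int timeoutSeconds) throws IOException {
        URL url = new URL(healthUrl(baseUrl, timeoutSeconds));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setConnectTimeout(5_000);
            // Allow a little more than the server-side timeout before giving up.
            conn.setReadTimeout((timeoutSeconds + 5) * 1000);
            // On a non-2xx answer the body is on the error stream.
            InputStream in = conn.getResponseCode() < 400
                    ? conn.getInputStream()
                    : conn.getErrorStream();
            if (in == null) {
                return false;
            }
            try (Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
                String body = scanner.hasNext() ? scanner.next() : "";
                return isGreen(body);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        try {
            // Show the dialog before this call and close it afterwards.
            boolean green = waitForGreen("http://localhost:9200", 60);
            System.out.println(green ? "cluster is green" : "still recovering");
        } catch (IOException e) {
            System.out.println("could not reach the node: " + e.getMessage());
        }
    }
}
```

Run `waitForGreen` on a background thread so the dialog stays responsive, and call it again if it returns false.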

Embedded Elasticsearch - second start-up takes a long time