Question

Running ES 1.5.2 on Java 1.8_45 and Windows 2008, with 4 nodes. Each machine has 32 cores, 128 GB RAM, and a 5 TB SSD.

My goal is to index about 2.5 billion documents. I'm up to 810 million so far. Each document averages 30k.

I currently have ES_HEAP_SIZE = 30g

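But I'm seeing a lot of memory pressure and STW pauses. Example: right now one node's heap usage is constantly above 90%, while the other nodes sit between 30% and 40%. So it seems that one node isn't GCing???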

Only 2 things are happening on the cluster: bulk indexing (no errors) and some scroll searches.

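I use doc values wherever possible. There is currently no field data cache (except for Marvel's, which is very small), and the filter cache is very small, about 100 MB per node.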

The node is still trying to recover, so I'd rather not stop the cluster completely and reset the RAM to 10 GB??

How I connect to the cluster for both bulk indexing and scroll searches:

// Do this once at application startup and re-use the client instance.
Settings settings = ImmutableSettings
    .settingsBuilder()
    .put("cluster.name", "xxxx")
    .build();

client = new TransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300));

Answer 1

Don't send the bulk requests to only one node. The same goes for search requests.

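Bulk requests are held in an in-memory buffer on the node that receives them, so sending every request of any kind to a single node is not a good idea. Round-robin the requests instead, either through a proxy server (if you have one) or through a client node to which you send the requests. The client node knows how to perform the round-robin itself.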

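You can also look at other options (depending on which clients access the cluster) and check whether those clients support automatic round-robin/load balancing of requests.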
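To make the round-robin idea above concrete, here is a minimal sketch in plain Java. It is not part of the Elasticsearch API: the class name RoundRobinNodes and the node addresses in the usage note are hypothetical, and it only illustrates how a caller could rotate through a list of nodes so that no single node receives every request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the Elasticsearch API: hands out node
// addresses in rotation so that no single node receives every request.
final class RoundRobinNodes {
    private final List<String> nodes;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinNodes(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes); // defensive copy
    }

    // Returns the next node in rotation; safe to call from several threads.
    String next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int i = Math.floorMod(counter.getAndIncrement(), nodes.size());
        return nodes.get(i);
    }
}
```

With four addresses such as "node1:9300" to "node4:9300" (placeholders), successive calls to next() return node1, node2, node3, node4 and then node1 again, so each bulk or search request could be directed at a different node. A proxy or a client node would do the equivalent work for you.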

When do we need a large heap for Elasticsearch?

1 answer: