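I'm creating indices for log data; about 100k documents are indexed every day.
I find that viewing the data through Kibana is a bit sluggish.
Every day an index such as `log-2019-04-11` is created, and I'm using an index pattern like `log*` to view the data in Kibana.
I'm using a single node, and the only thing I changed in the ES configuration is `-Xms8g -Xmx8g`.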
**Edit**
I'm aware of the concept of shards, but I haven't touched (configured) anything shard-related (for example, in elasticsearch.yml).
GET /analytics-prod*/_stats/
{
"_shards" : {
"total" : 550,
"successful" : 275,
"failed" : 0
},
"_all" : {
"primaries" : {
"docs" : {
"count" : 5128749,
"deleted" : 0
},
"store" : {
"size_in_bytes" : 4396388356
},
One day's worth of data looks roughly like this:
GET /analytics-prod-2019.04.14/_stats/
{
"_shards" : {
"total" : 10,
"successful" : 5,
"failed" : 0
},
"_all" : {
"primaries" : {
"docs" : {
"count" : 68912,
"deleted" : 0
},
"store" : {
"size_in_bytes" : 67331653
},
I have two computers. The first one is faster, but if possible I'd like to run ES on the slower one.
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 6
On-line CPU(s) list: 0-5
Thread(s) per core: 1
Core(s) per socket: 6
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
Stepping: 10
CPU MHz: 800.041
CPU max MHz: 4100.0000
CPU min MHz: 800.0000
BogoMIPS: 5808.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 9216K
NUMA node0 CPU(s): 0-5
=== START OF INFORMATION SECTION ===
Device Model: ST2000DM006-2DM164
Serial Number: Z560A76X
LU WWN Device Id: 5 000c50 09286b712
Firmware Version: CC26
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Apr 15 19:01:32 2019 KST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
The other computer has the following CPU / HDD:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 58
Model name: Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
Stepping: 9
CPU MHz: 1694.098
CPU max MHz: 3400.0000
CPU min MHz: 1600.0000
BogoMIPS: 6799.84
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-3
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue
Device Model: WDC WD10EZEX-00RKKA0
Serial Number: WD-WMC1S5459395
LU WWN Device Id: 5 0014ee 0ae471bd6
Firmware Version: 80.00A80
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Apr 15 19:02:20 2019 KST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
GET _cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.0.57 72 88 4 0.60 0.72 0.61 mdi * 3tf0hMb
GET analytics-prod-2019.04.01/_mapping
The output is too large to paste here, so I saved it and counted the lines:
$ cat a.json | wc -l
772
{
"analytics-prod-2019.04.01" : {
"mappings" : {
"doc" : {
"properties" : {
"@timestamp" : {
"type" : "date"
},
"@version" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"batch_no" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"data" : {
"properties" : {
"anonymousId" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
},
"channel" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"context" : {
"properties" : {
"app" : {
"properties" : {
"build" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"namespace" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"version" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"device" : {
"properties" : {
"adTrackingEnabled" : {
"type" : "boolean"
},
"advertisingId" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"id" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"manufacturer" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"model" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
},
"library" : {
"properties" : {
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"version" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"locale" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"network" : {
... it goes on like this; what I pasted is about 1/4 of the whole output.
Answer 0: (score: 0)
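Your daily data is very small (about 64 MB). An index that small does not need 5 shards. The recommended shard size is 10-40 GB, which means you are oversharding. When a search request runs against 10 indices with 5 shards each, it has to query 50 shards in total (and as far as I understand, you only have 1 node). If you reduce that to 1 shard per index, the same request only queries 10 shards.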
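To put numbers on this, the sketch below redoes the arithmetic with the figures from the `_stats` output in the question. It assumes each daily index has 5 primary shards plus 5 replica shards (the pre-7.0 defaults, which match the 10 shards reported per index); the function name is made up for illustration.

```python
def shard_summary(total_shards, shards_per_index, primary_shards, store_bytes):
    """Derive the index count and average primary shard size from _stats figures."""
    index_count = total_shards // shards_per_index
    avg_primary_mb = store_bytes / primary_shards / 1_000_000
    return index_count, avg_primary_mb


# Figures copied from GET /analytics-prod*/_stats/ in the question.
indices, avg_mb = shard_summary(
    total_shards=550,
    shards_per_index=10,      # assumption: 5 primaries + 5 replicas per index
    primary_shards=275,
    store_bytes=4396388356,   # _all.primaries.store.size_in_bytes
)
print(f"{indices} indices, about {avg_mb:.1f} MB per primary shard")

# A search over the whole pattern queries every primary shard once.
print(f"shards queried now: {indices * 5}, with 1 shard per index: {indices}")
```

Under these assumptions each primary shard holds only a few megabytes, far below the 10-40 GB guideline.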
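If you want to serve more search requests in parallel, add more replicas (for example, with 6 nodes you could add up to 5 replicas). You could also switch from daily indices to monthly ones; 100k documents per day is small.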
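As a rough sketch of both suggestions, the code below builds an index template body that gives new indices a single primary shard, and derives a monthly index name. It assumes a 6.x-style legacy template (the `index_patterns` key, sent with `PUT _template/<name>`), reuses the `analytics-prod-*` naming from the question, and sets replicas to 0 as my own assumption for a single-node setup. The helper names are hypothetical, and nothing is sent to a cluster.

```python
import json
from datetime import date


def monthly_index_name(prefix, day):
    """Return a monthly index name such as 'analytics-prod-2019.04'."""
    return f"{prefix}-{day:%Y.%m}"


def single_shard_template(pattern):
    """Build a template body giving new matching indices one primary shard."""
    return {
        "index_patterns": [pattern],
        "settings": {
            "number_of_shards": 1,
            # Assumption: with a single node, replicas cannot be allocated anyway.
            "number_of_replicas": 0,
        },
    }


template = single_shard_template("analytics-prod-*")
print(json.dumps(template, indent=2))
print(monthly_index_name("analytics-prod", date(2019, 4, 14)))
```

A template only applies to indices created after it is stored; existing daily indices keep their 5 shards unless they are reindexed or shrunk.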
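The Java heap should be about 50% of RAM. If you have 64 GB of RAM, set `-Xms30g -Xmx30g` and do not go above that, because beyond roughly 32 GB the JVM can no longer use compressed object pointers.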
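The rule of thumb can be written down as a small calculation. This is only a sketch: the 30 GB cap is the figure used in this answer, the function names are invented, and the resulting lines are what would go into `jvm.options` with the minimum and maximum heap set to the same value.

```python
def recommended_heap_gb(ram_gb, cap_gb=30):
    """Half of physical RAM, capped to stay below the compressed-pointer limit."""
    return min(ram_gb // 2, cap_gb)


def jvm_heap_options(ram_gb):
    """Return the two heap flags with equal minimum and maximum size."""
    heap = recommended_heap_gb(ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


for ram in (16, 64):
    print(ram, "GB RAM ->", " ".join(jvm_heap_options(ram)))
```

Note that the maximum heap flag is `-Xmx`; the remaining RAM is left to the operating system's file cache, which Elasticsearch also relies on.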
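You also need to set up your mappings or index templates properly. Elasticsearch generated the mapping for you dynamically, but I guess you do not run full-text search on most of these fields, so you can declare them as `keyword` in a predefined mapping.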
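One possible way to derive such a predefined mapping is to start from the dynamic one and collapse every `text` field that has a `keyword` sub-field into a plain `keyword` field. The sketch below does that on a small excerpt of the mapping from the question; the function name is hypothetical, and whether every field should really become `keyword` is an assumption you would need to check against your own queries.

```python
import json


def to_keyword_only(properties):
    """Recursively replace text+keyword multi-fields with plain keyword fields."""
    result = {}
    for name, spec in properties.items():
        if "properties" in spec:
            # Object field: convert its children.
            result[name] = {"properties": to_keyword_only(spec["properties"])}
        elif spec.get("type") == "text" and "keyword" in spec.get("fields", {}):
            # Reuse the settings of the existing keyword sub-field.
            result[name] = dict(spec["fields"]["keyword"])
        else:
            # Dates, booleans, numbers and so on stay unchanged.
            result[name] = spec
    return result


# Small excerpt of the dynamic mapping shown in the question.
dynamic = {
    "@timestamp": {"type": "date"},
    "channel": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "context": {
        "properties": {
            "device": {
                "properties": {
                    "adTrackingEnabled": {"type": "boolean"},
                    "model": {
                        "type": "text",
                        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                    },
                }
            }
        }
    },
}

explicit = to_keyword_only(dynamic)
print(json.dumps(explicit, indent=2))
```

The result would go under `mappings` in an index template, so it only affects newly created indices. Anything in Kibana that refers to `channel.keyword` would then have to refer to `channel` instead.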
Finally, you should use an SSD instead of a spinning disk to get more and faster I/O operations.