Details:
Logstash process:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
systemd+ 15395 457 5.3 6953124 424628 ? SNl 08:57 63:19 /usr/bin/java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Djava.io.tmpdir=/opt/logstash -Xmx2g -Xss2048k -Djffi.boot.library.path=/opt/logstash/vendor/jruby/lib/jni -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Djava.io.tmpdir=/opt/logstash -XX:HeapDumpPath=/opt/logstash/heapdump.hprof -Xbootclasspath/a:/opt/logstash/vendor/jruby/lib/jruby.jar -classpath : -Djruby.home=/opt/logstash/vendor/jruby -Djruby.lib=/opt/logstash/vendor/jruby/lib -Djruby.script=jruby -Djruby.shell=/bin/sh org.jruby.Main --1.9 /opt/logstash/lib/bootstrap/environment.rb logstash/runner.rb agent -f /etc/logstash/conf.d -l /var/log/logstash/logstash.log --auto-reload
Logstash threads (top -Hp):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4637 systemd+ 39 19 7045796 2,201g 5728 R 97,7 28,8 292:53.17 [main]>worker7
4632 systemd+ 39 19 7045796 2,201g 5728 R 97,3 28,8 140:04.15 [main]>worker2
4635 systemd+ 39 19 7045796 2,201g 5728 R 97,3 28,8 223:43.32 [main]>worker5
4630 systemd+ 39 19 7045796 2,201g 5728 R 97,0 28,8 281:26.59 [main]>worker0
4631 systemd+ 39 19 7045796 2,201g 5728 R 96,3 28,8 160:18.87 [main]>worker1
4636 systemd+ 39 19 7045796 2,201g 5728 R 96,3 28,8 134:52.64 [main]>worker6
4517 systemd+ 39 19 7045796 2,201g 5728 R 79,7 28,8 183:59.43 LogStash::Runne
4493 systemd+ 39 19 7045796 2,201g 5728 S 1,7 28,8 33:23.33 java
4495 systemd+ 39 19 7045796 2,201g 5728 S 1,7 28,8 33:18.66 java
4497 systemd+ 39 19 7045796 2,201g 5728 S 1,7 28,8 33:22.12 java
4498 systemd+ 39 19 7045796 2,201g 5728 S 1,7 28,8 33:21.07 java
4499 systemd+ 39 19 7045796 2,201g 5728 S 1,7 28,8 33:22.47 java
4494 systemd+ 39 19 7045796 2,201g 5728 S 1,3 28,8 33:23.83 java
4496 systemd+ 39 19 7045796 2,201g 5728 S 0,7 28,8 33:22.12 java
4500 systemd+ 39 19 7045796 2,201g 5728 S 0,7 28,8 33:22.44 java
4633 systemd+ 39 19 7045796 2,201g 5728 S 0,7 28,8 3:22.78 [main]>worker3
20526 systemd+ 39 19 7045796 2,201g 5728 S 0,7 28,8 0:15.27 [main]<kafka
4492 systemd+ 39 19 7045796 2,201g 5728 S 0,3 28,8 0:23.78 LogStash::Runne
4532 systemd+ 39 19 7045796 2,201g 5728 S 0,3 28,8 0:55.25 [main]<kafka
4536 systemd+ 39 19 7045796 2,201g 5728 S 0,3 28,8 1:29.43 [main]<kafka
4606 systemd+ 39 19 7045796 2,201g 5728 S 0,3 28,8 0:03.88 [main]<kafka
4638 systemd+ 39 19 7045796 2,201g 5728 S 0,3 28,8 0:48.34 pipeline.main
20529 systemd+ 39 19 7045796 2,201g 5728 S 0,3 28,8 0:14.41 [main]<kafka
20533 systemd+ 39 19 7045796 2,201g 5728 S 0,3 28,8 0:15.62 [main]<kafka
Logstash configuration
Several input blocks like this:
kafka {
  zk_connect => "pbdcnt1:2181,pbdcnt2:2181,pbdcnt3:2181"
  topic_id => "transactions"
  tags => ["transactions"]
  group_id => "prod_elastic_fraud"
  codec => plain {
    charset => "Windows-1250"
  }
}
And several filter blocks like this:
filter {
  if "transactions" in [tags] {
    grok {
      match => ["message", "(?<log_time>.*?)? ?(?<src>.*?)? ?(?<insert_time>.*?);(?<col1>.*?);(?<col2>.*?);...;(?<colN>.*?)"]
    }
  }
}
The output looks like this:
output {
  if [tags] {
    elasticsearch {
      hosts => ["localhost"]
      index => "%{tags}-%{+YYYY.MM.dd}"
    }
  }
}
Am I doing something wrong somewhere? If you need more data, just ask.
Thanks, Eddie
Update:
Is there a leak somewhere? Garbage collection time keeps increasing (jstat):
S0 S1 E O P YGC YGCT FGC FGCT GCT LGCC GCC
64.69 0.00 38.41 13.47 59.96 57906 2977.233 4916 206.461 3183.694 Allocation Failure No GC
0.00 51.04 27.61 13.48 59.96 57907 2977.291 4916 206.461 3183.753 Allocation Failure No GC
38.59 51.04 100.00 13.48 59.96 57908 2977.291 4916 206.461 3183.753 Allocation Failure Allocation Failure
57.04 0.00 77.47 13.48 59.96 57908 2977.342 4916 206.461 3183.803 Allocation Failure No GC
0.00 72.64 81.57 13.48 59.96 57909 2977.399 4916 206.461 3183.860 Allocation Failure No GC
61.65 0.00 34.59 13.48 59.96 57910 2977.448 4916 206.461 3183.909 Allocation Failure No GC
0.00 45.05 8.99 13.48 59.96 57911 2977.492 4916 206.461 3183.953 Allocation Failure No GC
48.47 0.00 7.30 13.49 59.96 57912 2977.539 4916 206.461 3184.000 Allocation Failure No GC
48.47 0.00 72.43 13.49 59.96 57912 2977.539 4916 206.461 3184.000 Allocation Failure No GC
0.00 53.80 43.27 13.49 59.96 57913 2977.588 4916 206.461 3184.050 Allocation Failure No GC
59.96 0.00 49.70 13.49 59.96 57914 2977.636 4916 206.461 3184.098 Allocation Failure No GC
59.96 0.00 95.84 13.49 59.96 57914 2977.636 4916 206.461 3184.098 Allocation Failure No GC
0.00 54.05 84.24 13.49 59.96 57915 2977.683 4916 206.461 3184.145 Allocation Failure No GC
65.37 0.00 84.28 13.49 59.96 57916 2977.743 4916 206.461 3184.205 Allocation Failure No GC
0.00 48.62 48.73 13.49 59.96 57917 2977.798 4916 206.461 3184.259 Allocation Failure No GC
53.00 0.00 24.35 13.50 59.96 57918 2977.845 4916 206.461 3184.307 Allocation Failure No GC
0.00 52.87 19.08 13.50 59.96 57919 2977.900 4916 206.461 3184.361 Allocation Failure No GC
0.00 52.87 84.52 13.50 59.96 57919 2977.900 4916 206.461 3184.361 Allocation Failure No GC
72.04 0.00 60.68 13.50 59.96 57920 2977.954 4916 206.461 3184.415 Allocation Failure No GC
0.00 65.02 64.85 13.51 59.96 57921 2978.009 4916 206.461 3184.471 Allocation Failure No GC
58.54 0.00 40.52 13.51 59.96 57922 2978.058 4916 206.461 3184.520 Allocation Failure No GC
58.54 0.00 98.01 13.51 59.96 57922 2978.058 4916 206.461 3184.520 Allocation Failure No GC
0.00 58.06 99.74 13.51 59.96 57924 2978.107 4916 206.461 3184.569 Allocation Failure Allocation Failure
69.63 0.00 81.55 13.51 59.96 57924 2978.163 4916 206.461 3184.624 Allocation Failure No GC
0.00 56.36 38.21 13.51 59.96 57925 2978.211 4916 206.461 3184.672 Allocation Failure No GC
52.32 0.00 46.66 13.51 59.96 57926 2978.260 4916 206.461 3184.721 Allocation Failure No GC
0.00 67.84 18.10 13.51 59.96 57927 2978.312 4916 206.461 3184.774 Allocation Failure No GC
jstack of the process: https://www.dropbox.com/s/u6tz0w4v0in8lev/logstash.jstack?dl=0
Answer 0 (score: 0)
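Sadly, it really was the regular expression that was consuming too much CPU. After I removed the regex and switched to the csv filter with the columns option (which is basically what I wanted to do anyway), CPU usage now sits at around 150% (which still seems too high for 200 transactions). But hey, that is how it is for now.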
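For reference, a csv filter for semicolon-separated messages could look roughly like the sketch below. The column names are placeholders, since the real column list is elided in the question; `separator` and `columns` are options of the csv filter. Note that the leading `log_time src insert_time` part of each line is separated by spaces, so in this sketch it ends up as a single string in the first column and would still need to be split separately.

```
filter {
  if "transactions" in [tags] {
    csv {
      # Fields are delimited by semicolons rather than the default comma.
      separator => ";"
      # Placeholder names (assumption): the actual columns are not shown in the question.
      columns => ["prefix", "col1", "col2", "col3"]
    }
  }
}
```

Because the csv filter splits on a fixed delimiter instead of trying many possible match positions, it avoids the backtracking that a chain of lazy `.*?` groups can trigger.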
What remains puzzling is why it worked for a while and then stopped. Maybe increasing the number of workers...
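If grok had to stay, one commonly suggested way to reduce regex backtracking is to anchor the pattern and replace the lazy `.*?` groups with negated character classes, so each group can only stop at the next semicolon. The following is an untested sketch with placeholder field names (the real column list is elided in the question), not a drop-in replacement for the original pattern:

```
filter {
  if "transactions" in [tags] {
    grok {
      # ^ and $ anchor the match to the whole line.
      # [^;]* consumes everything up to the next semicolon and cannot run past it.
      # "prefix", "col1", "col2" are hypothetical names used for illustration only.
      match => { "message" => "^(?<prefix>[^;]*);(?<col1>[^;]*);(?<col2>[^;]*)$" }
    }
  }
}
```

Whether this brings CPU usage down far enough would have to be measured against the real message format.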