High CPU usage by InfluxDB 1.7.x on a Raspberry Pi 3+

Time: 2019-07-07 20:31:54

Tags: raspberry-pi3 grafana influxdb

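I have set up a Raspberry Pi 3+ to run Grafana (with InfluxDB and Telegraf) to collect network statistics for my home network. I read data from a SonicWall, a "smart managed" HP switch and two Cisco switches. There are also some metrics on ping times and packet loss, and the machine hosts my Unifi access point manager as well.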

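This has been working for about 6 months. Over the past few days, InfluxDB has become sick. Grafana started showing 501 errors when trying to query InfluxDB. I rebooted the Pi and it came back... but about 12 hours later I was back to getting 501s.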

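I can see InfluxDB pegging the CPU. CPU usage was never high before, but now it sits between 200% and 250% all the time. The puzzling part is that, as far as I know, nothing has changed the query load on the database.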

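I think things got worse when I upgraded to InfluxDB 1.7.7, but I don't know what the previous version was. On top of that, I'm having trouble gathering any information from InfluxDB, because it pegs the CPU as soon as it starts and the host becomes unresponsive.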

How do I diagnose high CPU usage in InfluxDB?

Here is htop showing influx using more than 350% CPU:


-------------------------------------------------------------------------------
  2019-07-07 13:25:02
-------------------------------------------------------------------------------


  1  [||||||||||||||||||||||||                                                             25.5%]   Tasks: 36, 147 thr; 6 running
  2  [||||||||||||||||||||||||||||                                                         29.5%]   Load average: 3.43 3.84 3.78
  3  [||||||||||||||||||||||||                                                             25.6%]   Uptime: 00:47:19
  4  [||||||||||||||||||||||||||||||||||||||||||||||||||                                   54.7%]
  Mem[||||||||||||||||||||||||||||||||||                                               136M/926M]
  Swp[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||99.9M/100.0M]

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 4306 influxdb   20   0 1019M 48344 30068 R 121.  5.1  0:07.04 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
 4310 influxdb   20   0 1019M 48344 30068 S 16.4  5.1  0:00.43 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
 4309 influxdb   20   0 1019M 48344 30068 S 11.8  5.1  0:00.34 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
 4311 influxdb   20   0 1019M 48344 30068 S  7.2  5.1  0:00.37 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
  559 telegraf   20   0  832M 18420  7440 S  2.6  1.9  3:08.39 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 4270 pi         20   0  6372  3060  2072 R  2.6  0.3  0:01.06 htop
  116 root       20   0 29168  3012  2780 S  2.6  0.3  0:41.88 /lib/systemd/systemd-journald
 4307 influxdb   20   0 1019M 48344 30068 S  2.0  5.1  0:00.04 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
 4312 influxdb   20   0 1019M 48344 30068 S  1.3  5.1  0:00.24 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
 1066 telegraf   20   0  832M 18420  7440 R  1.3  1.9  0:09.25 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1057 telegraf   20   0  832M 18420  7440 S  0.7  1.9  0:11.60 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  340 mongodb    20   0  232M  2492  1760 S  0.7  0.3  0:35.16 /usr/bin/mongod --config /etc/mongodb.conf
 1234 telegraf   20   0  832M 18420  7440 S  0.7  1.9  0:07.61 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1239 telegraf   20   0  832M 18420  7440 S  0.7  1.9  0:08.03 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  451 mongodb    20   0  232M  2492  1760 S  0.7  0.3  0:14.52 /usr/bin/mongod --config /etc/mongodb.conf
  345 root       20   0 23756  1036   556 S  0.7  0.1  0:11.47 /usr/sbin/rsyslogd -n
  381 root       20   0 23756  1036   556 S  0.7  0.1  0:05.25 /usr/sbin/rsyslogd -n
  659 unifi      20   0 1112M 20080  1832 S  0.7  2.1  0:15.78 unifi -cwd /usr/lib/unifi -home /usr/lib/jvm/jdk-8-oracle-arm32-vfp-hflt/jre -cp /usr/share/java/commons-daemon.jar:/usr/lib/unifi/lib/ac
  445 mongodb    20   0  232M  2492  1760 S  0.7  0.3  0:05.27 /usr/bin/mongod --config /etc/mongodb.conf
  721 www-data   20   0  224M   384   332 S  0.7  0.0  0:01.90 /usr/sbin/apache2 -k start
  684 www-data   20   0  224M   384   332 S  0.7  0.0  0:01.90 /usr/sbin/apache2 -k start
  756 unifi      20   0 1112M 20080  1832 S  0.7  2.1  0:02.29 unifi -cwd /usr/lib/unifi -home /usr/lib/jvm/jdk-8-oracle-arm32-vfp-hflt/jre -cp /usr/share/java/commons-daemon.jar:/usr/lib/unifi/lib/ac
  765 grafana    20   0  924M 13820  3420 S  0.7  1.5  0:00.45 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid cfg:default.paths.logs=/var/log/
  671 telegraf   20   0  832M 18420  7440 S  0.0  1.9  0:11.24 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 3627 telegraf   20   0  832M 18420  7440 S  0.0  1.9  0:01.78 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  740 telegraf   20   0  832M 18420  7440 S  0.0  1.9  0:07.68 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  663 telegraf   20   0  832M 18420  7440 S  0.0  1.9  0:20.88 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1081 telegraf   20   0  832M 18420  7440 S  0.0  1.9  0:14.85 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1248 telegraf   20   0  832M 18420  7440 S  0.0  1.9  0:12.42 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  666 root       20   0  916M  5004  1464 S  0.0  0.5  0:00.35 /usr/bin/containerd
 4181 grafana    20   0  924M 13820  3420 S  0.0  1.5  0:00.03 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid cfg:default.paths.logs=/var/log/
 1241 telegraf   20   0  832M 18420  7440 S  0.0  1.9  0:07.99 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  667 root       20   0  916M  5004  1464 S  0.0  0.5  0:00.43 /usr/bin/containerd
F1Help  F2Setup F3SearchF4FilterF5Tree  F6SortByF7Nice -F8Nice +F9Kill  F10Quit



-------------------------------------------------------------------------------
  2019-07-07 13:25:02
-------------------------------------------------------------------------------


  1  [||||||||||||||||||||||||||                                                           28.0%]   Tasks: 36, 147 thr; 3 running
  2  [||||||||||||||||||||||||||||||||||||                                                 39.5%]   Load average: 3.57 3.85 3.79
  3  [||||||||||||||||||||||||||||||||||||||||||||||||||                                   53.9%]   Uptime: 00:47:45
  4  [|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||                76.3%]
  Mem[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||                     310M/926M]
  Swp[||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||99.4M/100.0M]

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 4306 influxdb   20   0 1972M  314M  123M S 189. 34.0  1:08.90 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
 4316 influxdb   20   0 1972M  314M  123M R 99.5 34.0  0:14.78 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
 4313 influxdb   20   0 1972M  314M  123M S 35.6 34.0  0:09.87 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
 4314 influxdb   20   0 1972M  314M  123M S 27.7 34.0  0:10.05 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
  559 telegraf   20   0  832M 19016  7712 S  4.0  2.0  3:10.10 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  740 telegraf   20   0  832M 19016  7712 S  3.3  2.0  0:07.75 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 4270 pi         20   0  6372  3060  2072 R  2.0  0.3  0:01.62 htop
  340 mongodb    20   0  232M  3192  2460 S  1.3  0.3  0:35.51 /usr/bin/mongod --config /etc/mongodb.conf
  663 telegraf   20   0  832M 19016  7712 S  0.7  2.0  0:21.13 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  451 mongodb    20   0  232M  3192  2460 S  0.7  0.3  0:14.66 /usr/bin/mongod --config /etc/mongodb.conf
 4307 influxdb   20   0 1972M  314M  123M S  0.7 34.0  0:00.20 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
  445 mongodb    20   0  232M  3192  2460 S  0.7  0.3  0:05.32 /usr/bin/mongod --config /etc/mongodb.conf
 1248 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:12.55 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1250 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:12.64 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  664 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:09.70 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1241 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:08.22 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  542 root       20   0  929M  7600  1052 S  0.0  0.8  0:04.49 /usr/bin/dockerd -H unix://
 3131 pi         20   0 11664   920   644 S  0.0  0.1  0:00.30 sshd: pi@pts/0
  764 unifi      20   0 1112M 20212  1832 R  0.0  2.1  0:04.79 unifi -cwd /usr/lib/unifi -home /usr/lib/jvm/jdk-8-oracle-arm32-vfp-hflt/jre -cp /usr/share/java/commons-daemon.jar:/usr/lib/unifi/lib/ac
 2910 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:04.93 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1057 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:11.69 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1234 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:07.79 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1236 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:13.93 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  671 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:11.35 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 1066 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:09.36 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  116 root       20   0 29168  3012  2780 S  0.0  0.3  0:42.06 /lib/systemd/systemd-journald
 1239 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:08.07 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
 3627 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:01.80 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  676 root       20   0  929M  7600  1052 S  0.0  0.8  0:00.47 /usr/bin/dockerd -H unix://
  659 unifi      20   0 1112M 20212  1832 S  0.0  2.1  0:15.84 unifi -cwd /usr/lib/unifi -home /usr/lib/jvm/jdk-8-oracle-arm32-vfp-hflt/jre -cp /usr/share/java/commons-daemon.jar:/usr/lib/unifi/lib/ac
 1081 telegraf   20   0  832M 19016  7712 S  0.0  2.0  0:14.87 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
  345 root       20   0 23756  1036   556 S  0.0  0.1  0:11.52 /usr/sbin/rsyslogd -n
  543 grafana    20   0  924M 13820  3420 S  0.0  1.5  0:06.82 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid cfg:default.paths.logs=/var/log/

In this state, I can't even run the Influx CLI:

 $ influx
Failed to connect to http://localhost:8086: Get http://localhost:8086/ping: dial tcp [::1]:8086: connect: connection refused
Please check your connection settings and ensure 'influxd' is running.
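
The error above is the CLI failing a request to the /ping endpoint on port 8086. As a rough sketch, the same check can be scripted to see whether influxd is accepting connections at a given moment. The default address is taken from the error message; the function name and timeout are made up for illustration.

```python
import urllib.error
import urllib.request


def influx_is_up(base_url="http://localhost:8086", timeout=2.0):
    """Return True if GET <base_url>/ping answers with a 2xx status."""
    try:
        with urllib.request.urlopen(base_url + "/ping", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Calling influx_is_up() in a loop with a short sleep would show how long the service stays reachable between restarts.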

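Update: it turns out influxdb does not write a log file; it now logs to the systemd journal, so the logs are available via sudo journalctl -u influxdb.service. I have updated this question with what I have found so far.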
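
Since the journal for a crash-looping service grows quickly, it can help to pull only a recent window. A minimal Python sketch, assuming the unit name influxdb.service from the question; -u, --since, --no-pager and -o are standard journalctl options, while the helper names and the default time window are hypothetical.

```python
import subprocess


def journalctl_cmd(unit="influxdb.service", since="1 hour ago"):
    """Build a journalctl command line for one unit and a recent time window."""
    return ["journalctl", "-u", unit, "--since", since, "--no-pager", "-o", "short"]


def read_journal(unit="influxdb.service", since="1 hour ago"):
    """Run journalctl and return its output lines.

    Reading the full journal may require sudo or journal group membership.
    """
    result = subprocess.run(
        journalctl_cmd(unit, since), capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```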

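Dumping the journal shows that the service starts up quickly, begins some compactions, and then runs out of memory. When that happens it shuts down... and then restarts.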

Jul 14 02:31:43 twang influxd[4139]: ts=2019-07-14T01:31:43.096464Z lvl=info msg="Compacting file" log_id=0GcXWe5l000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcXZGU0000 op_name=ts
Jul 14 02:31:43 twang influxd[4139]: ts=2019-07-14T01:31:43.096497Z lvl=info msg="Compacting file" log_id=0GcXWe5l000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcXZGU0000 op_name=ts
Jul 14 02:31:43 twang influxd[4139]: ts=2019-07-14T01:31:43.096198Z lvl=info msg="TSM compaction (start)" log_id=0GcXWe5l000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcXZGU0001 op_
Jul 14 02:31:43 twang influxd[4139]: ts=2019-07-14T01:31:43.097520Z lvl=info msg="Beginning compaction" log_id=0GcXWe5l000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcXZGU0001 op_na
Jul 14 02:31:43 twang influxd[4139]: ts=2019-07-14T01:31:43.097611Z lvl=info msg="Compacting file" log_id=0GcXWe5l000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcXZGU0001 op_name=ts
Jul 14 02:31:43 twang influxd[4139]: ts=2019-07-14T01:31:43.097652Z lvl=info msg="Compacting file" log_id=0GcXWe5l000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcXZGU0001 op_name=ts
Jul 14 02:31:43 twang influxd[4139]: ts=2019-07-14T01:31:43.097691Z lvl=info msg="Compacting file" log_id=0GcXWe5l000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcXZGU0001 op_name=ts
Jul 14 02:31:43 twang influxd[4139]: ts=2019-07-14T01:31:43.097726Z lvl=info msg="Compacting file" log_id=0GcXWe5l000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcXZGU0001 op_name=ts
:
:
:
Jul 14 01:55:08 twang influxd[1756]: ts=2019-07-14T00:55:08.256884Z lvl=info msg="TSM compaction (start)" log_id=0GcVQfaG000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcVTIt0000 op_
Jul 14 01:55:08 twang influxd[1756]: ts=2019-07-14T00:55:08.288481Z lvl=info msg="Beginning compaction" log_id=0GcVQfaG000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcVTIt0000 op_na
Jul 14 01:55:08 twang influxd[1756]: ts=2019-07-14T00:55:08.290445Z lvl=info msg="Compacting file" log_id=0GcVQfaG000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcVTIt0000 op_name=ts
Jul 14 01:55:08 twang influxd[1756]: ts=2019-07-14T00:55:08.292220Z lvl=info msg="Compacting file" log_id=0GcVQfaG000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcVTIt0000 op_name=ts
Jul 14 01:55:08 twang influxd[1756]: ts=2019-07-14T00:55:08.293889Z lvl=info msg="Compacting file" log_id=0GcVQfaG000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcVTIt0000 op_name=ts
Jul 14 01:55:08 twang influxd[1756]: ts=2019-07-14T00:55:08.295738Z lvl=info msg="Compacting file" log_id=0GcVQfaG000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcVTIt0000 op_name=ts
Jul 14 01:55:08 twang influxd[1756]: ts=2019-07-14T00:55:08.297635Z lvl=info msg="Compacting file" log_id=0GcVQfaG000 engine=tsm1 tsm1_strategy=full tsm1_optimize=false trace_id=0GcVTIt0000 op_name=ts
Jul 14 01:55:11 twang influxd[1756]: [httpd] ::1 - username [14/Jul/2019:01:55:10 +0100] "POST /write?consistency=any&db=telegraf HTTP/1.1" 204 0 "-" "telegraf" 07902d7a-a5d2-11e9-8001-b827eb6b4e27 11
Jul 14 01:55:11 twang influxd[1756]: [httpd] ::1 - - [14/Jul/2019:01:55:10 +0100] "POST /write?db=telegraf HTTP/1.1" 204 0 "-" "Telegraf/1.11.1" 079a21ac-a5d2-11e9-8002-b827eb6b4e27 1683504
Jul 14 01:55:12 twang influxd[1756]: [httpd] ::1 - username [14/Jul/2019:01:55:11 +0100] "POST /write?consistency=any&db=telegraf HTTP/1.1" 204 0 "-" "telegraf" 08451343-a5d2-11e9-8003-b827eb6b4e27 17
Jul 14 01:55:12 twang influxd[1756]: [httpd] ::1 - - [14/Jul/2019:01:55:11 +0100] "POST /write?db=telegraf HTTP/1.1" 204 0 "-" "Telegraf/1.11.1" 089bdbca-a5d2-11e9-8004-b827eb6b4e27 1182542
Jul 14 01:55:17 twang influxd[1756]: runtime: out of memory: cannot allocate 8192-byte block (540016640 in use)
Jul 14 01:55:17 twang influxd[1756]: fatal error: out of memory
Jul 14 01:55:17 twang influxd[1756]: runtime: out of memory: cannot allocate 8192-byte block (540016640 in use)
Jul 14 01:55:17 twang influxd[1756]: fatal error: out of memory
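
To see how often this happens and how much memory was in use each time, the journal lines can be summarized per influxd PID. The following is a sketch only: the regular expressions are written against the log lines shown above, the function name is illustrative, and the sample list is abridged from that log.

```python
import re

PID_RE = re.compile(r"influxd\[(\d+)\]:")
OOM_RE = re.compile(r"runtime: out of memory: .*\((\d+) in use\)")


def summarize_oom(lines):
    """Map each influxd PID to the largest 'in use' size (MiB) reported at an OOM crash."""
    crashes = {}
    for line in lines:
        pid = PID_RE.search(line)
        oom = OOM_RE.search(line)
        if pid and oom:
            mib = int(oom.group(1)) / (1024 * 1024)
            key = int(pid.group(1))
            crashes[key] = max(crashes.get(key, 0.0), mib)
    return crashes


# Sample lines abridged from the journal output above.
sample = [
    'Jul 14 01:55:12 twang influxd[1756]: [httpd] ::1 - - "POST /write?db=telegraf HTTP/1.1" 204 0',
    "Jul 14 01:55:17 twang influxd[1756]: runtime: out of memory: cannot allocate 8192-byte block (540016640 in use)",
    "Jul 14 01:55:17 twang influxd[1756]: fatal error: out of memory",
]
print(summarize_oom(sample))  # {1756: 515.0}
```

The 540016640 bytes reported at the crash come to 515 MiB, on a host that htop shows with 926M of RAM and a full 100M swap.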

Now I have to figure out how to get out of this situation. Any guesses?

1 answer:

Answer 0: (score: 0)

For some odd reason, my solution was to stop the service, start influx by hand, and let it run for a while, and then the CPU load was high. :D