This is the current setup:
One Redis server on the intranet and two Node.js servers in the DMZ, each running 8 PM2 instances. There is a firewall between the Node servers and the Redis server. We want to understand why PM2 restarts roughly every hour and why the number of connections on the Redis server keeps growing. We looked at the pm2 logs; some of the errors we found are:
Here is the redis conf:
# General
daemonize yes
pidfile "/var/run/redis/6379.pid"
dir "/apps/redis/6379"
port 6379
bind 0.0.0.0
timeout 0
tcp-keepalive 0
tcp-backlog 511
loglevel notice
logfile "/apps/log/redis/reids.log"
syslog-enabled yes
syslog-ident "redis_6379"
syslog-facility user
databases 16
# Snapshotting
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename "dump.rdb"
# Replication
slave-serve-stale-data yes
slave-read-only yes
repl-disable-tcp-nodelay no
slave-priority 100
min-slaves-max-lag 10
# Security
# Limits
maxclients 10000
maxmemory-policy noeviction
# Append Only Mode
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# Lua
lua-time-limit 5000
# Slow Log
slowlog-log-slower-than 10000
slowlog-max-len 128
# Event Notification
notify-keyspace-events ""
# Advanced
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
# Generated by CONFIG REWRITE
Here is the erratic behavior observed in pm2 and netstat:
...
...
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:hacl-gs ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:13039 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:23784 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:32191 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:18506 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:30892 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:26435 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:29383 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:28424 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:18174 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:3364 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:28449 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:20975 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:23668 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:21041 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:23206 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:14224 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:29325 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:28642 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:aap ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:14575 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:11651 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:5982 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:10445 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:22765 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:12192 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:8926 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:32150 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:10471 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:29948 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:25831 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:28012 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:5119 ESTABLISHED
[apps@vidmarredispdb4 redis]$ netstat | grep 6379 | grep 226 | wc -l
168
[apps@vidmarredispdb4 redis]$ netstat | grep 6379 | grep 227 | wc -l
40
PM2 restarts, and the uptimes are inconsistent:
[apps@vidmarnjspapp1 ~]$ pm2 list
┌─────────────┬────┬─────────┬───────┬────────┬─────────┬────────┬─────┬───────────┬──────────┐
│ App name │ id │ mode │ pid │ status │ restart │ uptime │ cpu │ mem │ watching │
├─────────────┼────┼─────────┼───────┼────────┼─────────┼────────┼─────┼───────────┼──────────┤
│ Pm2Http9615 │ 0 │ fork │ 23300 │ online │ 1 │ 39h │ 0% │ 25.1 MB │ disabled │
│ video-api │ 1 │ cluster │ 13434 │ online │ 3 │ 2h │ 0% │ 67.7 MB │ disabled │
│ video-api │ 2 │ cluster │ 23306 │ online │ 1 │ 39h │ 0% │ 77.8 MB │ disabled │
│ video-api │ 3 │ cluster │ 23312 │ online │ 1 │ 39h │ 0% │ 71.5 MB │ disabled │
│ video-api │ 4 │ cluster │ 6073 │ online │ 4 │ 20h │ 0% │ 68.1 MB │ disabled │
│ video-api │ 5 │ cluster │ 18850 │ online │ 2 │ 24h │ 0% │ 74.8 MB │ disabled │
│ video-api │ 6 │ cluster │ 23342 │ online │ 1 │ 39h │ 0% │ 74.2 MB │ disabled │
│ video-api │ 7 │ cluster │ 27569 │ online │ 2 │ 28h │ 0% │ 71.5 MB │ disabled │
│ video-api │ 8 │ cluster │ 23391 │ online │ 1 │ 39h │ 0% │ 78.2 MB │ disabled │
└─────────────┴────┴─────────┴───────┴────────┴─────────┴────────┴─────┴───────────┴──────────┘
Here is the tcpdump log:
Node version - 6.11.2, Redis version - 3.2.9, PM2 version - 2.4.6
Any help in understanding this erratic behavior around the connection count would be highly appreciated.
Answer 0 (score: 0)
A possible error:
process.on('uncaughtException', function(e) {
  console.log('Uncaught Exception...');
  console.log(e.stack);
  // disconnect redis or any other open connection here before exiting
  process.exit(99); // PM2 treats the exit as a crash and restarts the process
});
Make sure the Redis connection is created only once and reused for every request. Close the Redis connection when the Node process encounters any error.
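A minimal sketch of what that could look like, assuming the `redis` (node_redis) client package; the host below is a placeholder for your Redis server:

// redis-client.js -- the client is created once, when this module is first
// required; every file that require()s it shares the same connection.
var redis = require('redis');

var client = redis.createClient({
  host: '10.54.22.1', // placeholder: your Redis host
  port: 6379
});

client.on('error', function (err) {
  // Handle connection errors here instead of letting them surface as
  // uncaught exceptions that force PM2 to restart the worker.
  console.error('Redis error:', err);
});

module.exports = client;

Request handlers would then require('./redis-client') instead of calling redis.createClient() per request, and the uncaughtException handler above could call client.quit() before process.exit() so the server-side socket is closed instead of being left ESTABLISHED.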
Let me know how you are creating the Redis connection in your codebase.
Also, this could be causing a memory leak, since the pm2 list shows processes using more memory the longer they have been up.
Example -
at 2h uptime  │ 2h  │ 0% │ 67.7 MB
at 39h uptime │ 39h │ 0% │ 78.2 MB
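If you want to confirm that growth per worker, here is a small optional sketch using the standard process.memoryUsage() API (the one-minute interval is arbitrary):

// Log this worker's memory once a minute so growth can be correlated with
// the Redis connection count seen in netstat on the server.
setInterval(function () {
  var mem = process.memoryUsage();
  console.log('pid ' + process.pid +
    ' rss ' + Math.round(mem.rss / 1048576) + ' MB' +
    ' heapUsed ' + Math.round(mem.heapUsed / 1048576) + ' MB');
}, 60 * 1000);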