How to check what is causing a performance problem

Time: 2015-11-10 10:05:19

Tags: linux node.js performance io

I inherited a server that has some performance problems. It runs Node.js and nginx, basically a MEAN stack (although the DB is on another server).

Whenever I copy a file (a log file of about 150 MB) or open a file of that size in vim, the output of "iostat -x 1" looks like this:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda               0.00  8137.62    0.00   49.50     0.00 29924.75   604.48    17.32  123.54  16.50  81.68

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.59    0.00   24.34    0.00    0.00   74.07

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00   39.39     0.00 36606.06   929.23     2.42  351.64   1.87   7.37

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.78    0.00   24.44    0.00    0.00   72.78

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
scd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

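The main reason I am asking is that a simple RESTful API in my Node.js service sometimes responds slowly (going from 10 ms to 500 ms), and I am not sure what to look for or how to check what is causing it.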

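The same codebase copied to another server runs smoothly without any problem. The behaviour described above is the only lead I have found suggesting that something may be wrong with this server, but I am not sure what it is.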

The code is as follows:

In one file, statistic/index.js:

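// Load the sibling module that actually sends the tracking request.
var piwik = require('./piwik');

// Thin wrapper so callers only depend on the statistic module.
var tracker = function (data) {
  piwik.tracker(data);
};

exports.tracker = tracker;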

In another file, statistic/piwik.js:

var needle = require('needle');

exports.tracker = function (data) {
  var params = {};

  /** Assign params with data - just static string **/

  /** API_URL is another machine in same LAN **/
  needle.post(API_URL, params, function (err, response, body) {
    // Fire and forget: only log when the request fails or returns an unexpected status.
    if (err || (response.statusCode !== 200 && response.statusCode !== 204)) {
      console.error(err);
    }
  });
};

In the file that calls the above, routes/getuser.js:

exports.getUser = function (req, res) {

  async.auto({
    get_user: function (cb) {
      /** Read user data from DB **/
      cb();
    },
    record_statistic: ['get_user', function (cb) {
      statistic.tracker({ /** Pass static string data **/});
      cb();
    }]
  }, function (err) {
    if (err) {
      res.json(err);
    } else {
      res.json();
    }
  });
};

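The reason I mention the code above is that when I comment out statistic.tracker({ /** Pass static string data **/ }); the function responds within 50 ms, but when I include it, most of the time it responds in 100 ms - 500 ms. I even did a timestamp check around the needle HTTP post, and it responds within 10 to 20 ms.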
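
A timestamp check of this kind can be sketched as follows. This is only a minimal sketch, not the exact code used for the measurement above: `timed` is a hypothetical helper name, and the `setTimeout` call is a stand-in for the real request. In statistic/piwik.js the wrapped callback would be the one passed to needle.post.

```javascript
// Hypothetical helper (not from the original code): wraps a Node-style
// callback and reports how long it took until the callback fired.
function timed(label, callback, report) {
  var start = process.hrtime();
  return function () {
    var diff = process.hrtime(start);          // [seconds, nanoseconds]
    var ms = diff[0] * 1e3 + diff[1] / 1e6;    // convert to milliseconds
    (report || console.log)(label + ': ' + ms.toFixed(1) + ' ms');
    return callback.apply(this, arguments);
  };
}

// Stand-in for an asynchronous request that completes after about 15 ms.
setTimeout(timed('fake request', function () {}), 15);
```

Note that this measures only the time until the callback runs. It does not show how long the surrounding route handler takes to send its response.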

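When I copy a file (cp -p x.txt y.txt), especially a file larger than 100 MB, it also slows down my Node.js service. But even when nothing is happening on the server except nginx and Node.js listening for requests, the code above responds slowly (as long as I do not comment out the tracker call).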

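I suspect IO, but is there anywhere else I should check, or anything else I should look out for?

Here is some information about the server: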

[ec2-user@tlp-backend logs]$ uname -a
Linux tlp-backend 2.6.32-71.el6.x86_64 #1 SMP Fri May 20 03:51:51 BST 2011 x86_64 x86_64 x86_64 GNU/Linux

[ec2-user@tlp-backend logs]$ cat /proc/scsi/scsi 
Attached devices:
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: NECVMWar Model: VMware IDE CDR10 Rev: 1.00
  Type:   CD-ROM                           ANSI  SCSI revision: 05
Host: scsi2 Channel: 00 Id: 00 Lun: 00
  Vendor: VMware,  Model: VMware Virtual S Rev: 1.0 
  Type:   Direct-Access                    ANSI  SCSI revision: 02
[ec2-user@tlp-backend logs]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 30502
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

[ec2-user@tlp-backend logs]$ more /proc/meminfo 
MemTotal:        3918960 kB
MemFree:          392260 kB
Buffers:          296116 kB
Cached:          1205652 kB
SwapCached:          364 kB
Active:          1725084 kB
Inactive:        1155564 kB
Active(anon):     949492 kB
Inactive(anon):   430528 kB
Active(file):     775592 kB
Inactive(file):   725036 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       4095992 kB
SwapFree:        4092872 kB
Dirty:                28 kB
Writeback:             0 kB
AnonPages:       1378528 kB
Mapped:            29860 kB
Shmem:              1140 kB
Slab:             588628 kB
SReclaimable:     461108 kB
SUnreclaim:       127520 kB
KernelStack:        2296 kB
PageTables:        16940 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     6055472 kB
Committed_AS:    1829320 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      288456 kB
VmallocChunk:   34359446140 kB
HardwareCorrupted:     0 kB
AnonHugePages:    507904 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       10240 kB
DirectMap2M:     4184064 kB

[ec2-user@tlp-backend logs]$ more /proc/cpuinfo 
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 44
model name  : Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz
stepping    : 2
cpu MHz     : 2400.000
cache size  : 12288 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 2
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc aperfmperf unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm ida arat
bogomips    : 4800.00
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 44
model name  : Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz
stepping    : 2
cpu MHz     : 2400.000
cache size  : 12288 KB
physical id : 0
siblings    : 2
core id     : 1
cpu cores   : 2
apicid      : 1
initial apicid  : 1
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc aperfmperf unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm ida arat
bogomips    : 4800.00
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor   : 2
vendor_id   : GenuineIntel
cpu family  : 6
model       : 44
model name  : Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz
stepping    : 2
cpu MHz     : 2400.000
cache size  : 12288 KB
physical id : 1
siblings    : 2
core id     : 0
cpu cores   : 2
apicid      : 2
initial apicid  : 2
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc aperfmperf unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm ida arat
bogomips    : 4800.00
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

processor   : 3
vendor_id   : GenuineIntel
cpu family  : 6
model       : 44
model name  : Intel(R) Xeon(R) CPU           E5645  @ 2.40GHz
stepping    : 2
cpu MHz     : 2400.000
cache size  : 12288 KB
physical id : 1
siblings    : 2
core id     : 1
cpu cores   : 2
apicid      : 3
initial apicid  : 3
fpu     : yes
fpu_exception   : yes
cpuid level : 11
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc aperfmperf unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm ida arat
bogomips    : 4800.00
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

 top - 18:02:58 up 26 days, 18:49,  2 users,  load average: 0.00, 0.00, 0.00
Tasks: 176 total,   1 running, 175 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3918960k total,  3522368k used,   396592k free,   295340k buffers
Swap:  4095992k total,     3120k used,  4092872k free,  1201660k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
    1 root      20   0 19340 1248 1040 S  0.0  0.0   0:03.45 init               
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.01 kthreadd           
    3 root      RT   0     0    0    0 S  0.0  0.0   0:02.55 migration/0        
    4 root      20   0     0    0    0 S  0.0  0.0   0:01.57 ksoftirqd/0        
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 watchdog/0         
    6 root      RT   0     0    0    0 S  0.0  0.0   0:18.87 migration/1        
    7 root      20   0     0    0    0 S  0.0  0.0   0:01.34 ksoftirqd/1        
    8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 watchdog/1         
    9 root      RT   0     0    0    0 S  0.0  0.0   0:01.90 migration/2        
   10 root      20   0     0    0    0 S  0.0  0.0   0:01.30 ksoftirqd/2        
   11 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 watchdog/2  

1 answer:

Answer 0 (score: 0)

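Apparently, the main problem comes from the code itself. When you mix async.auto with the needle package, you need to explicitly set { connection: 'keep-alive' } in the HTTP headers.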

See more information here: https://github.com/tomas/needle/issues/148
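
A minimal sketch of what that change could look like, assuming needle's `post(url, data, options, callback)` form with a `headers` option. The names `createTracker`, `httpClient` and `apiUrl` are illustrative and do not appear in the original code. The HTTP client is passed in as an argument so the sketch can be run without the real package installed.

```javascript
// Sketch only: builds a tracker that sends every request with an explicit
// "Connection: keep-alive" header. `httpClient` is expected to expose a
// needle-style post(url, data, options, callback) method.
function createTracker(httpClient, apiUrl) {
  var options = { headers: { Connection: 'keep-alive' } };

  return function tracker(data, done) {
    var params = {}; // would be filled from `data`, as in the question

    httpClient.post(apiUrl, params, options, function (err, response) {
      var failed = err ||
        (response.statusCode !== 200 && response.statusCode !== 204);
      if (failed) {
        console.error(err || 'unexpected status ' + response.statusCode);
      }
      // `done` is optional so the call can stay fire-and-forget.
      if (done) {
        done(failed ? (err || new Error('unexpected status')) : null);
      }
    });
  };
}

// Usage with the real package (assumes needle is installed and API_URL is defined):
// var tracker = createTracker(require('needle'), API_URL);
// tracker({ /* static string data */ });
```

Whether this removes the latency on a given server would still need to be confirmed by measuring the response time before and after the change.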