/ proc / pid / stat文件中的cstime错误

时间:2012-01-09 08:14:29

标签: linux xen stat

stime文件中的

cstime/proc/pid/stat非常庞大,没有任何意义。但有时只是某些进程有错误cstime。如下:

# ps -eo pid,ppid,stime,etime,time,%cpu,%mem,command |grep nsc
4815     1 Jan08  1-01:20:02 213503-23:34:33 20226149  0.1 /usr/sbin/nscd
#
# cat /proc/4815/stat
4815 (nscd) S 1 4815 4815 0 -1 4202560 2904 0 0 0 21 1844674407359 0 0 20 0 9 0 4021 241668096 326 18446744073709551615 139782748139520 139782748261700 140737353849984 140737353844496 139782734487251 0 0 3674112 16390 18446744073709551615 0 0 17 1 0 0 0 0 0

您可以看到proc 4815的stime nscd1844674407359,等于213503-23:34:33,但刚刚为1-01:20:02生效。< / p>

另一个问题进程错误cstime如下:

一个bash fork一个sh,它分叉睡觉。

8155 (bash) S 3124 8155 8155 0 -1 4202752 1277 6738 0 0 3 0 4 1844674407368 20 0 1 0 1738175 13258752 451 18446744073709551615 4194304 4757932 140736528897536 140736528896544 47722675403157 0 65536 4100 65538 18446744071562341410 0 0 17 5 0 0 0 0 0

8184 (sh) S 8155 8155 8155 0 -1 4202496 475 0 0 0 0 0 0 0 20 0 1 0 1738185 11698176 357 18446744073709551615 4194304 4757932 140733266239472 140733266238480 47964680542613 0 65536 4100 65538 18446744071562341410 0 0 17 6 0 0 0 0 0

8185 (sleep) S 8184 8155 8155 0 -1 4202496 261 0 0 0 0 0 0 0 20 0 1 0 1738186 8577024 177 18446744073709551615 4194304 4212204 140734101195248 140734101194776 48002231427168 0 0 4096 0 0 0 0 17 7 0 0 0 0 0

所以你可以看到proc bash中的cstime是1844674407368,这远远大于其子节点的总cpu时间。

我的服务器有一个Intel(R)Xeon(R)CPU E5620 @ 2.40GHz,它是4核和8个线程。操作系统是Suse Linux Enterprise Server SP1 x86_64,如下所示

# lsb_release  -a
LSB Version:    core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
Distributor ID: SUSE LINUX
Description:    SUSE Linux Enterprise Server 11 (x86_64)
Release:    11
Codename:   n/a
#
# uname -a
Linux node2 2.6.32.12-0.7-xen #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux

这是内核的问题吗?有人可以帮忙解决吗?

1 个答案:

答案 0 :(得分:3)

我怀疑你可能只是看到了一个内核错误。更新到SLES的最新提供的更新内核(类似于2.6.32.42)并查看它是否仍然出现。顺便说一句,这是非常高的,而不是cstime,这是非常高的 - 实际上,看起来很接近你会发现它是一个值,就像字符串截断18446744073709551615(2 ^ 64-1)±几个时钟偏移。

pid_nr: 4815
tcomm: (nscd)
state: S
ppid: 1
pgid: 4815
sid: 4815
tty_nr: 0
tty_pgrp: -1
task_flags: 4202560 / 0x402040
min_flt: 2904
cmin_flt: 0
max_flt: 0
cmax_flt: 0
utime: 21 clocks (= 21 clocks) (= 0.210000 s)
stime: 1844674407359 clocks (= 1844674407359 clocks) (= 18446744073.590000 s)
cutime: 0 clocks (= 0 clocks) (= 0.000000 s)
cstime: 0 clocks (= 0 clocks) (= 0.000000 s)
priority: 20
nice: 0
num_threads: 9
always-zero: 0
start_time: 4021
vsize: 241668096
get_mm_rss: 326
rsslim: 18446744073709551615 / 0xffffffffffffffff
mm_start_code: 139782748139520 / 0x7f21b50c7000
mm_end_code: 139782748261700 / 0x7f21b50e4d44
mm_start_stack: 140737353849984 / 0x7ffff7fb9c80
esp: 140737353844496 / 0x7ffff7fb8710
eip: 139782734487251 / 0x7f21b43c1ed3
obsolete-pending-signals: 0 / 0x0
obsolete-blocked-signals: 0 / 0x0
obsolete-sigign: 3674112 / 0x381000
obsolete-sigcatch: 16390 / 0x4006
wchan: 18446744073709551615 / 0xffffffffffffffff
always-zero: 0
always-zero: 0
task_exit_signal: 17
task_cpu: 1
task_rt_priority: 0
task_policy: 0
delayacct_blkio_ticks: 0
gtime: 0 clocks (= 0 clocks) (= 0.000000 s)
cgtime: 0 clocks (= 0 clocks) (= 0.000000 s)