Question

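I'm trying to determine the processor queue length (the number of processes that are ready to run but not currently running) on a Linux machine. Windows has a WMI call for this metric, but I don't know much about Linux, so I've been digging through /proc and 'top' for the information. Is there a way to determine the CPU queue length?

Edit to add: Microsoft's wording for their metric: "The collection of one or more threads that is ready but not able to run on the processor due to another active thread that is currently running is called the processor queue."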

Answer 1

sar -q will report the queue length, the task list length, and the three load averages.

Example:

matli@tornado:~$ sar -q 1 0
Linux 2.6.27-9-generic (tornado)    01/13/2009  _i686_

11:38:32 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
11:38:33 PM         0       305      1.26      0.95      0.54
11:38:34 PM         4       305      1.26      0.95      0.54
11:38:35 PM         1       306      1.26      0.95      0.54
11:38:36 PM         1       306      1.26      0.95      0.54
^C
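
If sar is not installed, similar numbers can be read from /proc/loadavg, whose fourth field has the form runnable/total. As far as I know this is the file sar -q gets its data from, but treat that as an assumption. The function names parse_loadavg and read_loadavg below are made up for this sketch:

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    # Fourth field looks like "4/305": runnable entities / total entities.
    runnable, total = fields[3].split("/")
    return {
        "ldavg": tuple(float(x) for x in fields[:3]),  # 1, 5, 15 minute averages
        "runnable": int(runnable),  # includes tasks currently on a CPU
        "total": int(total),        # all scheduling entities on the system
    }


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_loadavg(f.read())
```

Calling read_loadavg() on a Linux host returns the current snapshot. Note that the runnable count includes the process doing the reading, so it is normally at least 1.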

Answer 2

vmstat

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  0 256368  53764  75980 220564    2   28    60    54  774 1343 15  4 78  2

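The first column (r) is the run queue: 2 on my machine.

Edit: surprised there is no way to get just the number.

A quick and dirty way to get the number (may vary between machines):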

  vmstat | tail -1 | awk '{print $1}'
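
A less layout-dependent alternative is to read the procs_running line from /proc/stat, which I believe is where vmstat takes its r column from (an assumption worth checking against your procps version). A minimal sketch, with the hypothetical helper names parse_procs_running and read_procs_running:

```python
def parse_procs_running(stat_text):
    """Return the procs_running value from the contents of /proc/stat."""
    for line in stat_text.splitlines():
        if line.startswith("procs_running "):
            return int(line.split()[1])
    raise ValueError("no procs_running line found")


def read_procs_running(path="/proc/stat"):
    """Read the live value (Linux only). Counts running plus runnable tasks."""
    with open(path) as f:
        return parse_procs_running(f.read())
```

The value is an instantaneous count and includes the reading process itself, so subtract tasks that are actually on a CPU if you want only those waiting.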

Answer 3

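uptime will give you the recent load average, which is roughly the average number of active processes. uptime reports the load average over the last 1, 5, and 15 minutes. This is a per-system measurement, not per-CPU.

I'm not sure what the processor queue length in Windows is; hopefully it's close enough to this?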
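
Since the load average is system-wide, one rough way to compare it with a per-processor figure is to divide by the CPU count. This is only a sketch: load_per_cpu and current_load_per_cpu are illustrative names, and on Linux the load average also counts tasks in uninterruptible sleep, so it is not purely a CPU queue measure.

```python
import os


def load_per_cpu(loadavg, ncpus):
    """Divide each load average by the number of CPUs."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    return tuple(x / ncpus for x in loadavg)


def current_load_per_cpu():
    """Use the live 1, 5 and 15 minute averages (Unix only)."""
    ncpus = os.cpu_count() or 1  # cpu_count() can return None
    return load_per_cpu(os.getloadavg(), ncpus)
```

A per-CPU value persistently above 1.0 suggests more demand than the CPUs can serve, but it says nothing about which CPU the waiting tasks are queued on.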

Answer 4

The metric you're looking for exists in /proc/schedstat.

The format of this file is described in sched-stats.txt in the kernel source. Specifically, the cpu<N> lines are what you want:

CPU statistics
--------------
cpu<N> 1 2 3 4 5 6 7 8 9


First field is a sched_yield() statistic:
     1) # of times sched_yield() was called


Next three are schedule() statistics:
     2) This field is a legacy array expiration count field used in the O(1)
    scheduler. We kept it for ABI compatibility, but it is always set to zero.
     3) # of times schedule() was called
     4) # of times schedule() left the processor idle


Next two are try_to_wake_up() statistics:
     5) # of times try_to_wake_up() was called
     6) # of times try_to_wake_up() was called to wake up the local cpu


Next three are statistics describing scheduling latency:
     7) sum of all time spent running by tasks on this processor (in jiffies)
     8) sum of all time spent waiting to run by tasks on this processor (in
        jiffies)
     9) # of timeslices run on this cpu

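In particular, field 8. To find the run queue length, you would:

Observe field 8 for each CPU and record the value.
Wait for some interval.
Observe field 8 for each CPU again, and calculate how much the value has increased.
Dividing that difference by the length of the interval waited (expressed in the same unit as field 8, which the documentation gives as jiffies) yields, by Little's Law, the average length of the scheduler run queue over that interval.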
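
The steps above can be sketched in Python as follows. This is not an existing tool; the function names are made up for illustration. The main assumption is the unit of field 8: the documentation quoted above says jiffies, but I believe recent kernels report nanoseconds, so the default units_per_second=1e9 should be verified on your system before trusting the result.

```python
import time


def parse_wait_times(schedstat_text):
    """Return {cpu_name: field 8}, the cumulative time tasks spent waiting to run."""
    waits = {}
    for line in schedstat_text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 f1 ... f9"; header and domain lines are skipped.
        if len(fields) >= 9 and fields[0].startswith("cpu"):
            waits[fields[0]] = int(fields[8])
    return waits


def average_queue_lengths(before, after, interval):
    """Little's Law: mean queue length = growth in waiting time / elapsed time.

    interval must be expressed in the same unit as field 8.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return {cpu: (after[cpu] - before[cpu]) / interval
            for cpu in before if cpu in after}


def sample(seconds=1.0, units_per_second=1e9, path="/proc/schedstat"):
    """Take two readings `seconds` apart and return the per-CPU average queue length.

    units_per_second=1e9 ASSUMES field 8 is in nanoseconds (unverified assumption).
    """
    with open(path) as f:
        before = parse_wait_times(f.read())
    start = time.monotonic()
    time.sleep(seconds)
    with open(path) as f:
        after = parse_wait_times(f.read())
    elapsed = time.monotonic() - start
    return average_queue_lengths(before, after, elapsed * units_per_second)
```

Calling sample(5.0) on a Linux host with scheduler statistics enabled returns a dict such as one entry per cpu<N> line; a value of 0.5 would mean that, on average, half a task was waiting on that CPU during the interval.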

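Unfortunately, I'm not aware of any utility that automates this process and is commonly installed, or even packaged, in Linux distributions. I haven't used it, but the kernel documentation suggests http://eaglet.rain.com/rick/linux/schedstat/v12/latency.c, which unfortunately refers to a domain that no longer resolves. Fortunately, it's available on the wayback machine.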

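Why not sar or vmstat?

These tools report the number of currently runnable processes. Certainly, if this number is greater than the number of CPUs, some of them must be waiting. However, processes can still be waiting even when there are fewer processes than CPUs, for several reasons:

A process may be pinned to a particular CPU.
The scheduler may decide to schedule a process on a particular CPU to make better use of the cache, or for NUMA optimization reasons.
The scheduler may intentionally idle a CPU to allow more time to a competing, higher-priority process on another CPU that shares the same execution core (a hyperthreading optimization).
Hardware interrupts may be processable only on particular CPUs, for a variety of hardware and software reasons.

Moreover, the number of runnable processes is only sampled at an instant in time. In many cases this number may fluctuate rapidly, and contention may occur between the times the metric is sampled.

This means that the number of runnable processes minus the number of CPUs is not a reliable indicator of CPU contention.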
