Question

$ time foo
real        0m0.003s
user        0m0.000s
sys         0m0.004s
$

What do "real", "user" and "sys" mean in the output of time?

Which one is meaningful when benchmarking my app?

Answer 1

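To expand on the accepted answer, I just wanted to provide another reason why real ≠ user + sys.

Keep in mind that real represents actual elapsed time, while the user and sys values represent CPU execution time. As a result, on a multicore system, the user and/or sys time (as well as their sum) can actually exceed the real time. For example, on a Java app I'm running for class I get this set of values: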

real    1m47.363s
user    2m41.318s
sys     0m4.013s

Answer 2

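• real: the actual time spent running the process from start to finish, as if it were measured by a human with a stopwatch

• user: the cumulative time spent by all the CPUs during the computation

• sys: the cumulative time spent by all the CPUs during system-related tasks such as memory allocation.

Notice that sometimes user + sys can be greater than real, as multiple processors may work in parallel.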

Answer 3

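Real shows the total turnaround time of the process, user shows the time spent executing user-defined instructions, and sys is the time spent executing system calls.

Real time also includes waiting time (waiting for I/O, etc.)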

Answer 4

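Minimal runnable POSIX C examples

To make things more concrete, I want to exemplify a few extreme cases of time with some minimal C test programs.

All programs can be compiled and run with: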

gcc -ggdb3 -o main.out -pthread -std=c99 -pedantic-errors -Wall -Wextra main.c
time ./main.out

and have been tested on Ubuntu 18.10, GCC 8.2.0, glibc 2.28, Linux kernel 4.18, ThinkPad P51 laptop, Intel Core i7-7820HQ CPU (4 cores / 8 threads), 2x Samsung M471A2K43BB1-CRC RAM (2x 16GiB).

sleep

Non-busy sleep does not count towards either user or sys, only real.

For example, a program that sleeps for one second:

#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    sleep(1);
    return EXIT_SUCCESS;
}

GitHub upstream.

outputs something like:

real    0m1.003s
user    0m0.001s
sys     0m0.003s

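The same holds for programs that are blocked waiting for IO to become available.

For example, the following program waits for the user to enter a character and press Enter: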

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("%c\n", getchar());
    return EXIT_SUCCESS;
}

GitHub upstream.

And if you wait for about one second before typing, it outputs something like the sleep example:

real    0m1.003s
user    0m0.001s
sys     0m0.003s

Multiple threads

The following example does niters iterations of useless, purely CPU-bound work on each of nthreads threads:

#define _XOPEN_SOURCE 700
#include <assert.h>
#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

uint64_t niters;

void* my_thread(void *arg) {
    uint64_t *argument, i, result;
    argument = (uint64_t *)arg;
    result = *argument;
    for (i = 0; i < niters; ++i) {
        result = (result * result) - (3 * result) + 1;
    }
    *argument = result;
    return NULL;
}

int main(int argc, char **argv) {
    size_t nthreads;
    pthread_t *threads;
    uint64_t rc, i, *thread_args;

    /* CLI args. */
    if (argc > 1) {
        niters = strtoll(argv[1], NULL, 0);
    } else {
        niters = 1000000000;
    }
    if (argc > 2) {
        nthreads = strtoll(argv[2], NULL, 0);
    } else {
        nthreads = 1;
    }
    threads = malloc(nthreads * sizeof(*threads));
    thread_args = malloc(nthreads * sizeof(*thread_args));

    /* Create all threads */
    for (i = 0; i < nthreads; ++i) {
        thread_args[i] = i;
        rc = pthread_create(
            &threads[i],
            NULL,
            my_thread,
            (void*)&thread_args[i]
        );
        assert(rc == 0);
    }

    /* Wait for all threads to complete */
    for (i = 0; i < nthreads; ++i) {
        rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
        printf("%" PRIu64 " %" PRIu64 "\n", i, thread_args[i]);
    }

    free(threads);
    free(thread_args);
    return EXIT_SUCCESS;
}

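GitHub upstream + plot code.

Then we plot wall, user and sys as a function of the number of threads, for a fixed 10^10 iterations, on my 8 hyperthread CPU:

From the graph, we see that:

• for a CPU-intensive single core application, wall and user are about the same
• for 2 cores, user is about 2x wall, which means that the user time is counted across all threads.

User basically doubled, while wall stayed the same.
• this continues up to 8 threads, which matches the number of hyperthreads in my computer.

After 8, wall starts to increase as well, because we don't have any extra CPUs to get more work done in a given amount of time!

The ratio plateaus at this point.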

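Sys-heavy work with sendfile

The heaviest sys workload I could come up with was to use sendfile, which does a file copy operation in kernel space (see: Copy a file in a sane, safe and efficient way).

So I imagined that this in-kernel memcpy would be a CPU-intensive operation.

First I initialize a large 10GiB random file with: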

dd if=/dev/urandom of=sendfile.in.tmp bs=1K count=10M

Then run the code:

#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char **argv) {
    char *source_path, *dest_path;
    int source, dest;
    struct stat stat_source;
    if (argc > 1) {
        source_path = argv[1];
    } else {
        source_path = "sendfile.in.tmp";
    }
    if (argc > 2) {
        dest_path = argv[2];
    } else {
        dest_path = "sendfile.out.tmp";
    }
    source = open(source_path, O_RDONLY);
    assert(source != -1);
    dest = open(dest_path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    assert(dest != -1);
    assert(fstat(source, &stat_source) != -1);
    assert(sendfile(dest, source, 0, stat_source.st_size) != -1);
    assert(close(source) != -1);
    assert(close(dest) != -1);
    return EXIT_SUCCESS;
}

GitHub upstream.

which gives mostly system time, as expected:

real    0m2.175s
user    0m0.001s
sys     0m1.476s

I was also curious to see if time would distinguish between the syscalls of different processes, so I tried:

time ./sendfile.out sendfile.in1.tmp sendfile.out1.tmp &
time ./sendfile.out sendfile.in2.tmp sendfile.out2.tmp &

And the result was:

real    0m3.651s
user    0m0.000s
sys     0m1.516s

real    0m4.948s
user    0m0.000s
sys     0m1.562s

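The sys time is about the same for both as for a single process, but the wall time is larger, likely because the processes are competing for disk read access.

So it seems that time does in fact account for which process started a given piece of kernel work.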

Bash源代码

When you run just time <cmd> on Ubuntu, it uses the Bash keyword, as can be seen from:

type time

which outputs:

time is a shell keyword

So we grep the Bash 4.19 source code for the output string:

git grep '"user\b'

which leads us to the time_command function in execute_cmd.c, which uses:

• gettimeofday() and getrusage() if both are available
• times() otherwise

all of which are Linux system calls and POSIX functions.

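GNU time source code

If we call it as:

/usr/bin/time

then it uses the GNU time implementation (a separate package, not part of GNU Coreutils).

This one is a bit more convoluted, but the relevant source seems to be in resuse.c, and it does:

• a non-POSIX BSD wait3 call if that is available
• times and gettimeofday otherwise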

Answer 5

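Very simply, I like to think about it like this:

• real is the actual amount of time it took to run the command (as if you had timed it with a stopwatch)
• user and sys are how much "work" the CPU had to do to execute the command. This "work" is expressed in units of time.

Generally speaking:

• user is how much work the CPU did to run the command's code
• sys is how much work the CPU had to do to handle "system overhead" type tasks (such as allocating memory, file I/O, etc.) in order to support the running command

Since these last two times count "work" done, they don't include time a thread might have spent waiting (such as waiting on another process or for disk I/O to finish).

real, however, is a measure of actual runtime and not of "work", so it does include any time spent waiting.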

Answer 6

I want to mention another scenario in which the real time is much bigger than user + sys. I've created a simple server which responds after a long time:

real 4.784
user 0.01s
sys  0.01s

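The point is that in this scenario the process spends its time waiting for a response, and that waiting is counted in neither user nor sys.

Something similar happens when you run the find command. In that case, the time is mostly spent requesting data from the SSD and waiting for its response.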
