Question

I am writing a profiler whose use case looks like this:

long getTiming() 
{
    long start = someGetTimeFunction();
    executeSomething();
    return someGetTimeFunction() - start;
}
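As a concrete version of the pattern above, here is a minimal sketch that uses clock_gettime() with CLOCK_MONOTONIC and returns the delta in nanoseconds. The names get_timing_ns, timespec_delta_ns and execute_something are hypothetical stand-ins for illustration and are not part of the original code.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed between two readings of the same clock.
 * Assumes `end` was taken after `start`. */
static int64_t timespec_delta_ns(const struct timespec *start,
                                 const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}

/* Hypothetical stand-in for executeSomething(): a small busy loop. */
static void execute_something(void)
{
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < 1000; i++)
        sink += i;
}

/* Time one call of `work` and return the elapsed nanoseconds.
 * The result includes the cost of the two clock_gettime() calls. */
static int64_t get_timing_ns(void (*work)(void))
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    work();
    clock_gettime(CLOCK_MONOTONIC, &end);
    return timespec_delta_ns(&start, &end);
}
```

Calling get_timing_ns(execute_something) returns the elapsed time for one call. On older glibc versions (before 2.17) clock_gettime() lives in librt, so the program has to be linked with -lrt.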

No matter which time function I use, it seems to add a lot of overhead. I have tried gettimeofday(), clock_gettime() with CLOCK_MONOTONIC, CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID, and I have tried a bit of assembly I found to call rdtsc.

Running each 500,000 times, this is what they cost:

[INFO] [ OK ] X.TimeGetTimeOfDay (1165 ms)
[INFO] [ OK ] X.TimeRdtscl (1208 ms)
[INFO] [ OK ] X.TimeMonotomicGetTime (1536 ms)
[INFO] [ OK ] X.TimeProcessGetTime (1575 ms)
[INFO] [ OK ] X.TimeThreadGetTime (1522 ms)

Here are my test cases:

TEST(X, TimeGetTimeOfDay)
{    
    for (int i = 0; i < 500000; i++) {
        timeval when;
        gettimeofday(&when, NULL);
    }
}

TEST(X, TimeRdtscl)
{
    for (int i = 0; i < 500000; i++) {
        unsigned long long when;
        rdtscl(&when);
    }
}

TEST(X, TimeMonotomicGetTime)
{
    for (int i = 0; i < 500000; i++) {
        struct timespec when;
        clock_gettime(CLOCK_MONOTONIC, &when);
    }
}

TEST(X, TimeProcessGetTime)
{
    for (int i = 0; i < 500000; i++) {
        struct timespec when;
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &when);
    }
}


TEST(X, TimeThreadGetTime)
{
    for (int i = 0; i < 500000; i++) {
        struct timespec when;
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &when);
    }
}
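To put the reported totals in per-call terms: 1165 ms for 500,000 gettimeofday() calls is about 2.3 µs per call, and 1536 ms for clock_gettime(CLOCK_MONOTONIC) is about 3.1 µs per call. These are upper bounds, since the totals also include loop and test-framework overhead. The sketch below shows that arithmetic plus a small harness that does not depend on the test framework; per_call_ns, time_clock_gettime_ms and report are hypothetical helper names, not part of the original tests.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Average cost of one call in nanoseconds, given a total in milliseconds. */
static double per_call_ns(double total_ms, long calls)
{
    return total_ms * 1e6 / (double)calls;
}

/* Make `calls` back-to-back clock_gettime() calls on `clock_id` and
 * return the total wall-clock time in milliseconds. */
static double time_clock_gettime_ms(clockid_t clock_id, long calls)
{
    struct timespec start, end, when;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < calls; i++)
        clock_gettime(clock_id, &when);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return (end.tv_sec - start.tv_sec) * 1e3
         + (end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Print the total and the per-call average for one clock. */
static void report(const char *name, clockid_t clock_id, long calls)
{
    double total_ms = time_clock_gettime_ms(clock_id, calls);
    printf("%-28s %8.1f ms total, %8.1f ns/call\n",
           name, total_ms, per_call_ns(total_ms, calls));
}
```

For example, report("CLOCK_MONOTONIC", CLOCK_MONOTONIC, 500000) repeats the third test above and prints the average cost per call on the machine it runs on.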

This is on a CentOS 5 VirtualBox VM running on a MacBook Pro.

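Since I only need to compute deltas, I do not need absolute time. There is also no risk of comparing times obtained on different cores or CPUs of an SMP system.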

Can I do better?

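This is the rdtsc wrapper used in the tests, taken from the assembly I found:

inline void rdtscl(unsigned long long *t)
{
    /* rdtsc returns the low 32 bits in EAX and the high 32 bits in EDX */
    unsigned int lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    *t = (unsigned long long)lo | ((unsigned long long)hi << 32);
}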

Answer 1

I created a separate thread that updates a boost::atomic every 1 ms.

My execution thread reads this long for its timestamps.

Much better throughput.
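The answer gives no code, so the following is only a sketch of the idea under stated assumptions: C11 <stdatomic.h> and pthreads stand in for boost::atomic, the cached value is a millisecond count taken from CLOCK_MONOTONIC, and the names ticker_start, ticker_stop and cached_now_ms are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

static atomic_long g_now_ms;   /* cached timestamp in milliseconds */
static atomic_bool g_running;  /* cleared to ask the ticker thread to exit */

/* Read the real clock; only the ticker thread pays this cost. */
static long monotonic_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long)ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Ticker thread: refresh the cached timestamp roughly every 1 ms. */
static void *ticker(void *arg)
{
    const struct timespec period = {0, 1000000L};
    (void)arg;
    while (atomic_load_explicit(&g_running, memory_order_relaxed)) {
        atomic_store_explicit(&g_now_ms, monotonic_ms(), memory_order_relaxed);
        nanosleep(&period, NULL);
    }
    return NULL;
}

/* Start the ticker; returns 0 on success, like pthread_create(). */
static int ticker_start(pthread_t *tid)
{
    atomic_store(&g_now_ms, monotonic_ms());
    atomic_store(&g_running, true);
    return pthread_create(tid, NULL, ticker, NULL);
}

/* Stop the ticker and wait for it to exit. */
static void ticker_stop(pthread_t tid)
{
    atomic_store(&g_running, false);
    pthread_join(tid, NULL);
}

/* Cheap timestamp for the profiled thread: a single atomic load. */
static inline long cached_now_ms(void)
{
    return atomic_load_explicit(&g_now_ms, memory_order_relaxed);
}
```

The profiled thread calls cached_now_ms() before and after the work and subtracts the two values. The trade-off is resolution: deltas are only accurate to about 1 ms plus scheduling jitter, so this suits coarse intervals rather than very short ones.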

What is the fastest Linux C time function for computing time deltas? Seeing poor performance with clock_gettime and gettimeofday
