Question

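After porting some legacy code from win32 to win64, I asked about the best strategy for getting rid of the "possible loss of data" warning. I am now about to replace unsigned int with size_t throughout my code.

However, my code is performance-critical (I can't even run it in Debug... it's too slow).

I ran a quick benchmark: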

#include "stdafx.h"

#include <iostream>
#include <chrono>
#include <cstdlib>   // std::rand
#include <string>

template<typename T> void testSpeed()
{
    auto start = std::chrono::steady_clock::now();

    T big = 0;
    for ( T i = 0; i != 100000000; ++i )
        big *= std::rand();

    std::cout << "Elapsed " << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - start).count() << "ms" << std::endl;
}

int main()
{
    testSpeed<size_t>();
    testSpeed<unsigned int>();

    std::string str;
    std::getline( std::cin, str ); // pause

    return 0;
}

Compiled for x64, the output is:

Elapsed 2185ms
Elapsed 2157ms

Compiled for x86, the output is:

Elapsed 2756ms
Elapsed 2748ms

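So apparently using size_t instead of unsigned int has no significant performance impact. But is that always the case? (It's hard to benchmark performance this way.)

Can changing unsigned int to size_t affect CPU performance, given that 64-bit objects will now be manipulated instead of 32-bit ones?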

Answer 1

Definitely not. On modern (and even older) CPUs, 64-bit integer operations run as fast as 32-bit ones.

Example on my i7 4600u for the arithmetic operation a * b / c:

(int32_t) * (int32_t) / (int32_t): 1.3 nsec
(int64_t) * (int64_t) / (int64_t): 1.3 nsec

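Both tests were compiled for an x64 target (the same as yours).

However, if your code manages big objects full of integers (large arrays of integers, for example), using size_t instead of unsigned int may affect performance if the number of cache misses increases (the larger data may exceed the cache capacity). The most reliable way to check the performance impact is to test your application in both cases. Use your own type, typedef'ed to either size_t or unsigned int, and then benchmark your application.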
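As a minimal sketch of that typedef approach: the alias name index_t, the macro USE_32BIT_INDEX, and the helpers footprintBytes and sumIndices are all hypothetical names chosen for illustration, not something from the original code. The idea is that a single build flag switches the integer width everywhere, so the same application can be benchmarked both ways.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical project-wide alias. Build with -DUSE_32BIT_INDEX (or
// /DUSE_32BIT_INDEX on MSVC) to switch back to unsigned int for comparison.
#ifdef USE_32BIT_INDEX
typedef unsigned int index_t;
#else
typedef std::size_t index_t;
#endif

// Number of bytes occupied by n contiguous elements of type T.
// Handy for estimating whether an array still fits in cache after the switch.
template <typename T>
constexpr std::size_t footprintBytes(std::size_t n)
{
    return n * sizeof(T);
}

// Example of application code written against the alias rather than
// against a concrete integer type.
index_t sumIndices(const std::vector<index_t>& values)
{
    index_t total = 0;
    for (index_t v : values)
        total += v;
    return total;
}
```

For instance, footprintBytes<std::uint32_t>(1000) is 4000 bytes while footprintBytes<std::uint64_t>(1000) is 8000 bytes, which is the doubling that can push a large array out of cache.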

Answer 2

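At least on Intel, the ALU can execute two 32-bit operations in parallel if there is no data dependency. If size_t is 64 bits, only one operation can be executed.

In your example there is no difference, because you have a data dependency (big depends on itself).

You could see a difference in code such as: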

uint32_t a = std::rand();
uint32_t b = std::rand();
const uint32_t randVal = std::rand();

for (int i = 0; i < 10000000; ++i) {
    a += randVal;
    b += randVal;
}

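If you switch a and b to uint64_t, the loop may run slower, because only one operation can be executed at a time.

Note that the ALU cannot perform 4 operations in parallel on 16-bit integers, or 8 on 8-bit ones. It is only 2 operations for 32-bit data, or 1 operation for 64-bit data.

Note: this is not something you can see in the generated machine code. This parallelization happens inside the CPU.

Edit: Also, see this part of the talk by Andrei Alexandrescu, where he mentions this.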
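To try this yourself, here is a rough sketch that wraps the loop above in a template so the same code can be timed with uint32_t and uint64_t. The names AddResult and timeIndependentAdds are hypothetical, and the timing is only indicative: an optimizing compiler may collapse the loop into a single multiplication, in which case both widths will look identical.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical result type: final values plus elapsed wall-clock time.
template <typename T>
struct AddResult {
    T a;
    T b;
    long long elapsedMs;
};

// Runs two additions per iteration that do not depend on each other
// (a never reads b and b never reads a), then reports the elapsed time.
template <typename T>
AddResult<T> timeIndependentAdds(T a, T b, T step, int iterations)
{
    const auto start = std::chrono::steady_clock::now();

    for (int i = 0; i < iterations; ++i) {
        a += step;
        b += step;
    }

    const auto stop = std::chrono::steady_clock::now();
    const long long ms =
        std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();

    // Returning a and b keeps the results observable to the caller.
    return AddResult<T>{a, b, ms};
}
```

Calling timeIndependentAdds<std::uint32_t>(a, b, randVal, 10000000) and then the std::uint64_t version with the same arguments, and comparing elapsedMs, gives a first impression; repeat the runs several times before drawing conclusions.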
