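I am trying to measure the performance of some low-level assembly code generated by the compiler. At some point, though, I got strange results that I don't understand. Here is my code: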
#include <chrono>
#include <cstdlib>   // std::srand, std::rand
#include <ctime>     // std::time
#include <iostream>
#include <vector>
//---------------------------------------------------------------------------------------
int main()
{
    std::srand(static_cast<unsigned>(std::time(nullptr)));
    constexpr unsigned vectorSize = 1000u;
    constexpr unsigned loopCount = 1000000u;
    std::vector<int> vec1(vectorSize);
    std::vector<int> vec2(vectorSize);
    // Fill both vectors with random values.
    for (unsigned i = 0u; i < vectorSize; ++i)
    {
        vec1[i] = std::rand();
    }
    for (unsigned i = 0u; i < vectorSize; ++i)
    {
        vec2[i] = std::rand();
    }
    std::chrono::time_point<std::chrono::high_resolution_clock> start;
    std::chrono::time_point<std::chrono::high_resolution_clock> end;
    long long ms;
    //---------------------------------------------------------------------------------------
    // Measurement 1 of 5 (the blocks below are intentional verbatim copies).
    start = std::chrono::high_resolution_clock::now();
    for (unsigned j = 0u; j < loopCount; ++j)
    {
        for (unsigned i = 0u; i < vec2.size(); ++i)
        {
            vec2[i] = vec1[i];
        }
    }
    end = std::chrono::high_resolution_clock::now();
    ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    std::cout << "Evaluation took: " << ms << " ms" << std::endl;
    //---------------------------------------------------------------------------------------
    // Measurement 2 of 5.
    start = std::chrono::high_resolution_clock::now();
    for (unsigned j = 0u; j < loopCount; ++j)
    {
        for (unsigned i = 0u; i < vec2.size(); ++i)
        {
            vec2[i] = vec1[i];
        }
    }
    end = std::chrono::high_resolution_clock::now();
    ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    std::cout << "Evaluation took: " << ms << " ms" << std::endl;
    //---------------------------------------------------------------------------------------
    // Measurement 3 of 5.
    start = std::chrono::high_resolution_clock::now();
    for (unsigned j = 0u; j < loopCount; ++j)
    {
        for (unsigned i = 0u; i < vec2.size(); ++i)
        {
            vec2[i] = vec1[i];
        }
    }
    end = std::chrono::high_resolution_clock::now();
    ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    std::cout << "Evaluation took: " << ms << " ms" << std::endl;
    //---------------------------------------------------------------------------------------
    // Measurement 4 of 5.
    start = std::chrono::high_resolution_clock::now();
    for (unsigned j = 0u; j < loopCount; ++j)
    {
        for (unsigned i = 0u; i < vec2.size(); ++i)
        {
            vec2[i] = vec1[i];
        }
    }
    end = std::chrono::high_resolution_clock::now();
    ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    std::cout << "Evaluation took: " << ms << " ms" << std::endl;
    //---------------------------------------------------------------------------------------
    // Measurement 5 of 5.
    start = std::chrono::high_resolution_clock::now();
    for (unsigned j = 0u; j < loopCount; ++j)
    {
        for (unsigned i = 0u; i < vec2.size(); ++i)
        {
            vec2[i] = vec1[i];
        }
    }
    end = std::chrono::high_resolution_clock::now();
    ms = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    std::cout << "Evaluation took: " << ms << " ms" << std::endl;
    //---------------------------------------------------------------------------------------
    std::cout << "Press enter to exit..." << std::endl;
    std::cin.get();
    return 0;
}
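The main "work" in this program is done in the loops, where the elements of vector vec1 are simply assigned to vector vec2. The inner for-loop condition (i < vec2.size()) is meant to keep the compiler from applying SIMD optimizations. I repeated the measurement five times by copy-pasting the code, so I expected each copy to report more or less the same time. But the results differ: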
Evaluation took: 425 ms
Evaluation took: 694 ms
Evaluation took: 462 ms
Evaluation took: 441 ms
Evaluation took: 710 ms
Press enter to exit...
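This is what I get after building in Release mode with Visual Studio 2015. Surprisingly, with GCC the measured times are all very similar, but with Visual Studio I get this odd pattern on every run, and it is always the same.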
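So I would like to ask: is my code broken in some way, and am I missing something? Or is this just strange compiler behavior? Getting correct time measurements matters for any kind of benchmark, so I don't want to make mistakes from the very start.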