Consider the following code (LWS):
#include <iostream>
#include <chrono>
#include <cmath>
#include <ctime>
#include <cstdlib>

template <class Counter, class Function, class... Args>
inline double benchmark(const Counter& counter, Function&& f, Args&&... args)
{
    const std::chrono::high_resolution_clock::time_point marker
        = std::chrono::high_resolution_clock::now();
    for (Counter i = Counter(); i < counter; ++i) {
        f(args...);
    }
    return std::chrono::duration_cast<std::chrono::duration<double> >
        (std::chrono::high_resolution_clock::now()-marker).count();
}

int main(int argc, char* argv[])
{
    srand(time(nullptr));
    double y = rand()%10+1;
    std::cout<<benchmark(1000000, [](double x){return std::sin(x);}, y)<<"\n";
    return 0;
}
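The function benchmark measures the execution time of a function. The problem is that, with optimization enabled, the call to f is removed and the loop is reduced to an empty statement. Is there a way to force the function to actually execute?
EDIT:
1) I am looking for a solution in standard C++ (no compiler-specific directives).
2) It would be better if f could stay as generic as possible (for example, a void return type).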
Answer 0 (score: 3)
I found this solution, which uses volatile:
#include <iostream>
#include <chrono>
#include <cmath>

template <class Clock = std::chrono::high_resolution_clock, class Counter, class Function, class... Args>
inline double benchmark(const Counter& counter, Function&& f, Args&&... args)
{
    volatile decltype(f(args...)) temporary = decltype(f(args...))();
    const typename Clock::time_point marker = Clock::now();
    for (Counter i = Counter(); i < counter; ++i) {
        temporary = f(args...);
    }
    return std::chrono::duration<double>(Clock::now()-marker).count();
}

int main(int argc, char* argv[])
{
    std::cout<<benchmark(1000000000, [](double x){return std::sin(x);}, 3.)<<"\n";
    return 0;
}
If you know how to improve this code, please leave a comment.
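One limitation of the volatile version is that it cannot be instantiated when f returns void, which the question's second edit asks for. Below is a minimal sketch of one way to cover both cases, assuming C++17 is available (if constexpr and std::is_void_v). The name benchmark_any is hypothetical and not part of the original answer. In the void branch, the call is only certain to survive optimization if f itself has observable side effects, such as a write to a volatile object.

```cpp
#include <chrono>
#include <cmath>
#include <type_traits>

// Hypothetical variant of the volatile-based benchmark (C++17 assumed).
template <class Clock = std::chrono::high_resolution_clock,
          class Counter, class Function, class... Args>
double benchmark_any(const Counter& counter, Function&& f, Args&&... args)
{
    using Result = decltype(f(args...));
    using Value = std::decay_t<Result>;  // strip references and cv-qualifiers

    const typename Clock::time_point marker = Clock::now();
    if constexpr (std::is_void_v<Result>) {
        // Nothing to store: f must have observable side effects of its own,
        // otherwise the compiler may still remove the calls.
        for (Counter i = Counter(); i < counter; ++i) {
            f(args...);
        }
    } else {
        // Each assignment to a volatile object is an observable side effect,
        // so the store cannot be dropped. This works for arithmetic and
        // pointer types; class types would need a volatile-qualified
        // assignment operator.
        volatile Value sink = Value();
        for (Counter i = Counter(); i < counter; ++i) {
            sink = f(args...);
        }
    }
    return std::chrono::duration<double>(Clock::now() - marker).count();
}
```

It would be called the same way as the original, for example benchmark_any(1000000, [](double x){return std::sin(x);}, 3.), or with a void lambda that increments a captured volatile counter.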
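Answer 1 (score: 0)
Since the (anonymous) function returns a value, why not capture that value in benchmark and do something trivial with it, such as adding it to a value passed in by reference? Like this: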
template <class Counter, class Function, class... Args>
inline double benchmark(double& sum, const Counter& counter, Function&& f, Args&&... args)
{
    const std::chrono::high_resolution_clock::time_point marker
        = std::chrono::high_resolution_clock::now();
    for (Counter i = Counter(); i < counter; ++i) {
        sum += f(args...);
    }
    return std::chrono::duration_cast<std::chrono::duration<double> >
        (std::chrono::high_resolution_clock::now()-marker).count();
}
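I think the compiler will now have a hard time optimizing the function call away (assuming you print the sum in main() or use it in some other way).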