Question

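I noticed that if I print a long string (char *) using cout, it seems to reach the screen one character at a time. This happens on Windows 7, Vista, and Linux (through PuTTY), using Visual C++ 2008 on Windows and G++ on Linux. printf is so much faster that I actually switched from cout to printf for most of the printing in my project. This confuses me, because this question makes it seem like I am the only one with this problem.

I even wrote a cout replacement that appears to beat the pants off cout on my machine: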

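#include <cstdio>     // printf, sprintf_s
#include <cstdlib>    // rand, _itoa_s
#include <cstring>    // strncpy, strlen, memset
#include <iostream>   // cout
#include <windows.h>  // Sleep
using namespace std;

class rcout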
{
public:
    char buff[4096];
    unsigned int size;
    unsigned int length;

    rcout()
    {
        size = 4096;
        length = 0;
        buff[0] = '\0';
    }

    ~rcout()
    {
        printf("%s", buff);
    }

    rcout &operator<<(char *b)
    {
        strncpy(buff+length, b, size-length);
        unsigned int i = strlen(b);
        if(i+length >= size)
        {
            buff[size-1] = '\0';
            printf("%s", buff);
            b += (size-length) -1;
            length = 0;
            return (*this) << b;
        }
        else
            length += i;
        return (*this);
    }

    rcout &operator<<(int i)
    {
        char b[32];
        _itoa_s(i, b, 10);
        return (*this)<<b;
    }

    rcout &operator<<(float f)
    {
        char b[32];
        sprintf_s(b, 32, "%f", f);
        return (*this)<<b;
    }
};

int main()
{
    char buff[65536];
    memset(buff, 0, 65536);

    for(int i=0;i<3000;i++)
        buff[i] = rand()%26 + 'A';

    rcout() << buff << buff <<"\n---"<< 121 <<"---" << 1.21f <<"---\n";
    Sleep(1000);
    cout << "\n\nOk, now cout....\n\n";
    cout << buff << buff <<"\n---"<< 121 <<"---" << 1.21f <<"---\n";
    Sleep(1000);
    cout << "\n\nOk, now me again....\n\n";
    rcout() << buff << buff <<"\n---"<< 121 <<"---" << 1.21f <<"---\n";
    Sleep(1000);

    return 0;
}

Any ideas why cout prints so slowly for me?

Answer 1

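Note: these experimental results are valid for MSVC. With other library implementations, the results will vary.

printf can be (much) faster than cout. Although printf parses the format string at run time, it needs far fewer function calls and actually executes fewer instructions than cout to do the same job. Here is a summary of my experiments:

Number of static instructions

In general, cout generates much more code than printf. Suppose we have the following cout code that prints with some formatting: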

os << setw(width) << dec << "0x" << hex << addr << ": " << rtnname <<
  ": " << srccode << "(" << dec << lineno << ")" << endl;

With the VC++ compiler and optimizations enabled, it generates 188 bytes of code. When you replace it with printf-based code, only 42 bytes are required.
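The answer does not show the printf-based replacement, so the following is only a sketch of what one might look like. The parameter types, the function names `format_with_stream` and `format_with_printf`, and the use of `snprintf` into a string are all assumptions made for illustration; `'\n'` stands in for `endl` because the output goes to a string rather than a terminal.

```cpp
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// Stream version: mirrors the manipulator chain quoted above.
// Parameter types are assumptions; the original does not declare them.
std::string format_with_stream(int width, unsigned int addr,
                               const std::string& rtnname,
                               const std::string& srccode, int lineno) {
    std::ostringstream os;
    os << std::setw(width) << std::dec << "0x" << std::hex << addr << ": "
       << rtnname << ": " << srccode << "(" << std::dec << lineno << ")"
       << '\n';
    return os.str();
}

// printf-style version: a single format string describes the whole line.
std::string format_with_printf(int width, unsigned int addr,
                               const std::string& rtnname,
                               const std::string& srccode, int lineno) {
    const char* fmt = "%*s%x: %s: %s(%d)\n";
    // First pass measures the output, second pass writes it.
    int n = std::snprintf(nullptr, 0, fmt, width, "0x", addr,
                          rtnname.c_str(), srccode.c_str(), lineno);
    if (n < 0) return std::string();
    std::string out(static_cast<std::size_t>(n) + 1, '\0');
    std::snprintf(&out[0], out.size(), fmt, width, "0x", addr,
                  rtnname.c_str(), srccode.c_str(), lineno);
    out.resize(static_cast<std::size_t>(n));  // drop the trailing NUL
    return out;
}
```

Note that `setw(width)` applies only to the next item written, which here is the literal "0x"; the `%*s` conversion reproduces that padding.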

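Number of dynamically executed instructions

The static instruction count only describes the difference in the binary. What matters more is the number of instructions actually executed at run time. I also ran a simple experiment:

Test code: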

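#include <cstdio>     // fprintf, fflush
#include <iomanip>    // setfill, setw, setprecision
#include <iostream>   // cout, cerr
#include <windows.h>  // GetTickCount, DWORD
using namespace std;

int a = 1999;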
char b = 'a';
unsigned int c = 4200000000;
long long int d = 987654321098765;
long long unsigned int e = 1234567890123456789;
float f = 3123.4578f;
double g = 3.141592654;

void Test1()
{
    cout
        << "a:" << a << "\n"
        << "a:" << setfill('0') << setw(8) << a << "\n"
        << "b:" << b << "\n"
        << "c:" << c << "\n"
        << "d:" << d << "\n"
        << "e:" << e << "\n"
        << "f:" << setprecision(6) << f << "\n"
        << "g:" << setprecision(10) << g << endl;
}

void Test2()
{
    fprintf(stdout,
        "a:%d\n"
        "a:%08d\n"
        "b:%c\n"
        "c:%u\n"
        "d:%I64d\n"
        "e:%I64u\n"
        "f:%.2f\n"
        "g:%.9lf\n",
        a, a, b, c, d, e, f, g);
    fflush(stdout);
}

int main()
{
    DWORD A, B;
    DWORD start = GetTickCount();
    for (int i = 0; i < 10000; ++i)
        Test1();
    A = GetTickCount() - start;

    start = GetTickCount();
    for (int i = 0; i < 10000; ++i)
        Test2();
    B = GetTickCount() - start;

    cerr << A << endl;
    cerr << B << endl;
    return 0;
}

Here is the result for Test1 (cout):

Executed instructions: 423,234,439
Memory loads/stores: approx. 320,000 and 980,000
Elapsed time: 52 seconds

So, what about printf? Here is the result for Test2:

Executed instructions: 164,800,800
Memory loads/stores: approx. 70,000 and 180,000
Elapsed time: 13 seconds

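On this machine and with this compiler, printf was much faster than cout. Both the number of executed instructions and the number of loads/stores (which hints at the number of cache misses) differ by a factor of 3 to 4.

I know this is an extreme case. I should also note that cout is much easier to use when you handle 32/64-bit data and need independence from 32/64-bit platforms. There is always a trade-off. I use cout when checking the types is very tricky.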
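The trade-off around 64-bit data can be sketched as follows. The test code above uses `%I64d`, which is specific to the Microsoft runtime; the sketch instead uses the standard `PRId64` macro from `<cinttypes>`, which is a substitution on my part, not something the answer used. The function names are hypothetical.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>

// printf family: the length modifier must match the argument type.
// PRId64 expands to the correct modifier for the current platform.
std::string int64_with_printf(std::int64_t value) {
    char buf[32];  // enough for any 64-bit decimal value, sign, and NUL
    std::snprintf(buf, sizeof buf, "%" PRId64, value);
    return buf;
}

// iostream: overload resolution selects the conversion from the type,
// so there is no format specifier to get wrong.
std::string int64_with_stream(std::int64_t value) {
    std::ostringstream os;
    os << value;
    return os.str();
}
```

With the stream version, changing the variable's type later requires no change at the call site, which is the convenience the answer refers to.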

OK, cout in MSVS just sucks :)

Answer 2

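I would suggest trying the same test on a different computer. I do not have a good answer for why this might be happening; all I can say is that I have never noticed a speed difference between cout and printf. I also tested your code with gcc 4.3.2 on Linux and saw no difference whatsoever.

That being said, you cannot easily replace cout with your own implementation. cout is an instance of std::ostream, which has a lot of built-in functionality that is necessary for interoperability with other classes that overload the iostream operators.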
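To illustrate the interoperability point, here is a minimal sketch with a hypothetical `Point` type. Its `operator<<` is written against `std::ostream`, so it works with `std::cout`, file streams, and string streams, whereas a hand-rolled class such as the `rcout` in the question would need its own overload for every such type.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user type, used only for illustration.
struct Point {
    int x;
    int y;
};

// Overload written against std::ostream, so any standard output stream
// (std::cout, std::ofstream, std::ostringstream) can print a Point.
std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << "(" << p.x << ", " << p.y << ")";
}

// Formats a Point through the overload above and returns the text.
std::string describe(const Point& p) {
    std::ostringstream os;
    os << "point=" << p;
    return os.str();
}
```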

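Edit

Anyone who says printf is always faster than std::cout is simply wrong. I just ran the test code posted by minjang, using gcc 4.3.2 with the -O2 flag on a 64-bit AMD Athlon X2, and cout was actually faster.

I got the following results: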

printf: 00:00:12.024
cout:   00:00:04.144

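Is cout always faster than printf? Probably not, especially not with older implementations. But on newer implementations, iostreams are likely to be faster than stdio, because the compiler knows at compile time which functions it needs to call to convert integers, floats, or objects to strings.

More importantly, the relative speed of printf and cout depends on the implementation, so the problem described by the OP is not easy to explain.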
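Since the outcome depends on the implementation, the practical step is to measure on your own toolchain. Below is a minimal timing sketch; `time_ms` and the two workload functions are hypothetical names, and `std::chrono::steady_clock` replaces the Windows-only `GetTickCount` used in the earlier test so that the same code builds on Linux as well.

```cpp
#include <chrono>
#include <cstdio>
#include <iostream>

// Runs `fn` `iterations` times and returns the elapsed wall-clock time
// in milliseconds.
template <typename Fn>
long long time_ms(Fn fn, int iterations) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               stop - start).count();
}

// Hypothetical workloads: each writes one formatted line per call.
void line_with_cout() { std::cout << "a:" << 1999 << '\n'; }
void line_with_printf() { std::printf("a:%d\n", 1999); }
```

A caller could then report both timings on stderr, for example `std::cerr << time_ms(line_with_cout, 10000) << " ms\n";` followed by the same call with `line_with_printf`. Results will differ depending on whether stdout is a terminal, a pipe, or a file.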

Answer 3

Try calling ios::sync_with_stdio(false); before using std::cout/cin, unless of course you mix stdio and iostream in your program, which is a bad thing to do.
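A minimal sketch of this suggestion follows. The helper names are hypothetical. `std::ios_base::sync_with_stdio` returns the previous synchronization setting, and calling it after I/O has already occurred on the standard streams has implementation-defined behavior, so the call belongs at the very start of the program.

```cpp
#include <iostream>

// Hypothetical helper: turns off iostream/stdio synchronization and
// returns the previous setting. Call it before any output is produced.
bool disable_stdio_sync() {
    return std::ios_base::sync_with_stdio(false);
}

// Hypothetical helper: writes `count` short lines through std::cout and
// flushes once at the end.
void print_lines(int count) {
    for (int i = 0; i < count; ++i) {
        std::cout << "line " << i << '\n';
    }
    std::cout.flush();
}
```

After `disable_stdio_sync()`, the program should not interleave `printf` and `std::cout` output, because their relative order is no longer guaranteed.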

Answer 4

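In my experience from programming competitions, printf is faster than cout.

I remember many occasions when my solution missed the time limit only because of cin/cout, while printf/scanf made it.

Besides that, it seems normal (at least to me) for cout to be slower than printf, because it performs more operations.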

Answer 5

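Try using some endls or flushes, since they flush cout's buffer, in case the OS is caching your program's output for whatever reason. But, as Charles says, there is no good explanation for this behavior, so if that does not help, it is probably a problem specific to your machine.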
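For reference, `std::endl` writes a newline and then flushes the stream, while `'\n'` only writes the newline. The sketch below uses hypothetical helper names and shows both styles producing the same text; only the timing of the flushes differs.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Writes `count` lines to `out`. With flush_each_line set, every line ends
// with std::endl (newline plus flush); otherwise lines end with '\n' and
// the stream is flushed once at the end.
void write_lines(std::ostream& out, int count, bool flush_each_line) {
    for (int i = 0; i < count; ++i) {
        if (flush_each_line) {
            out << "line " << i << std::endl;
        } else {
            out << "line " << i << '\n';
        }
    }
    out << std::flush;  // make sure nothing is left in the buffer
}

// Convenience wrapper that captures the output in a string.
std::string lines_as_string(int count, bool flush_each_line) {
    std::ostringstream os;
    write_lines(os, count, flush_each_line);
    return os.str();
}
```

Passing `std::cout` as `out` lets you observe whether flushing after every line changes how the output appears on your terminal.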

Answer 6

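You should first try writing all your data to an ostringstream, and then use cout on the ostringstream's str(). I am on 64-bit Windows 7, and Test1 was already significantly faster than Test2 (your mileage may vary). Building a single string with an ostringstream first and then using cout on it further reduced Test1's execution time by a factor of about 3 to 4. Be sure to #include <sstream>.

I.e., replace

void Test1()
{
    cout
        << "a:" << a << "\n"
        << "a:" << setfill('0') << setw(8) << a << "\n"
        << "b:" << b << "\n"
        << "c:" << c << "\n"
        << "d:" << d << "\n"
        << "e:" << e << "\n"
        << "f:" << setprecision(6) << f << "\n"
        << "g:" << setprecision(10) << g << endl;
}

with:

void Test1()
{
    ostringstream oss;
    oss
        << "a:" << a << "\n"
        << "a:" << setfill('0') << setw(8) << a << "\n"
        << "b:" << b << "\n"
        << "c:" << c << "\n"
        << "d:" << d << "\n"
        << "e:" << e << "\n"
        << "f:" << setprecision(6) << f << "\n"
        << "g:" << setprecision(10) << g << endl;
    cout << oss.str();
}

I suspect ostringstream makes this so much faster because it does not try to write to the screen each time you call operator<<, as cout does. I have also noticed from experience that reducing the number of times you write to the screen (by writing more at once) improves performance (again, your mileage may vary).

E.g.,

void Foo1()
{
    for(int i = 0; i < 10000; ++i) {
        cout << "Foo1\n";
    }
}

void Foo2()
{
    std::string s;
    for(int i = 0; i < 10000; ++i) {
        s += "Foo2\n";
    }
    cout << s;
}

void Foo3()
{
    std::ostringstream oss;
    for(int i = 0; i < 10000; ++i) {
        oss << "Foo3\n";
    }
    cout << oss.str();
}

In my case, Foo1 took 1,092 ms, Foo2 took 234 ms, and Foo3 took 218 ms. ostringstreams are your friend. Obviously, Foo2 and Foo3 (typically) require more memory. To compare this against a C-style function, try sprintf into a buffer and then write that buffer with fprintf; you should see a further efficiency gain over Test2 (though for me this improved Test2's performance by only about 10%; cout and printf are indeed different beasts).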

Compiler: MinGW64 (TDM and its bundled libraries).
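The C-style comparison mentioned in this answer (format into a buffer first, then write the buffer in one call) might be sketched as below. The function names, the 256-byte buffer, the reduced set of fields, and the use of `snprintf` and `fwrite` in place of `sprintf` and `fprintf` are my assumptions, chosen to keep the sketch bounds-checked and portable.

```cpp
#include <cstdio>

// Formats one record into `buf` (capacity `cap`). Returns the number of
// characters written, or -1 if the record did not fit. The values repeat a
// few of the Test2 globals from the earlier answer.
int format_record(char* buf, std::size_t cap) {
    int a = 1999;
    char b = 'a';
    unsigned int c = 4200000000u;
    double g = 3.141592654;
    int n = std::snprintf(buf, cap, "a:%d\na:%08d\nb:%c\nc:%u\ng:%.9f\n",
                          a, a, b, c, g);
    if (n < 0 || static_cast<std::size_t>(n) >= cap) return -1;
    return n;
}

// Formats into a local buffer, then hands the whole block to stdio in a
// single call instead of one call per field.
bool print_record() {
    char buf[256];
    int n = format_record(buf, sizeof buf);
    if (n < 0) return false;
    std::size_t len = static_cast<std::size_t>(n);
    return std::fwrite(buf, 1, len, stdout) == len;
}
```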

Answer 7

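Try using ios::sync_with_stdio(false);. Call it before using std::cin/cout. By default, the iostream standard streams are synchronized with their corresponding standard C streams (for example, std::cin/std::wcin with stdin); passing false turns that synchronization off, so after the call you should not mix stdio and iostream on the same stream.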

Answer 8

Here is a hack that should make C++ streams as fast as C printf. I never tested it, but I believe it works.

ios_base::sync_with_stdio(0);

C++ cout printing slowly

8 answers:
