Question

Update: I just managed to beat my own 32-if code:

void test(char *file_char, unsigned int size)
{
    char* file_ = file_char;
    char* size_x = file_char+size;
    char to_find = 0;

    for(unsigned int i = 0; i < 10000; i++)
    {
        file_char = file_;

        while(*file_char++ != to_find);//skip all characters till we find a 0

        if(*file_char)//some if in order to avoid compiler removing our test code
            std::cout << "found";
    }
}

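The code above requires that 0 appears at least once in the array, otherwise it reads past the end of the buffer. It is, however, a bit faster than the if code and much more compact.

Is there any way to make the code above faster? (Given a char array, find the position where a particular char occurs.)

I wrote some code and I am really puzzled.

Init: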

int main()
{
    FILE *file;
    file = fopen("C:\\data.txt", "rb");

    static const int size = 60000;

    char *file_char = (char*)malloc(size);

    unsigned int i = 0;
    while(i < size)
        fread(&file_char[i++], 1, 1, file);

    clock_t clock_ = clock();
    test(file_char, size);
    std::cout << ((double)(clock() - clock_)) / CLOCKS_PER_SEC;  // elapsed seconds
    return 0;
}

The following code takes 3.5 seconds to execute:

void test(char *file_char, unsigned int size)
{
    for(unsigned int i = 0; i < 100000; i++)
    {
        unsigned int pos = 0;
        char to_find = 0;
        while(pos < size)
            if(file_char[pos++] == to_find)
                std::cout << "found";
    }
}

But the following code takes 1.8 seconds, HALF the time!

void test(char *file_char, unsigned int size)
{
    for(unsigned int i = 0; i < 100000; i++)
    {
        unsigned int pos = 0;
        char to_find = 0;
        while(pos < size)
        {
            if(file_char[pos] == to_find)
                std::cout << "found";
            else if(file_char[pos+1] == to_find)
                std::cout << "found";
            else if(file_char[pos+2] == to_find)
                std::cout << "found";
            else if(file_char[pos+3] == to_find)
                std::cout << "found";
            else if(file_char[pos+4] == to_find)
                std::cout << "found";
            else if(file_char[pos+5] == to_find)
                std::cout << "found";
            else if(file_char[pos+6] == to_find)
                std::cout << "found";
            else if(file_char[pos+7] == to_find)
                std::cout << "found";
            else if(file_char[pos+8] == to_find)
                std::cout << "found";
            else if(file_char[pos+9] == to_find)
                std::cout << "found";
            else if(file_char[pos+10] == to_find)
                std::cout << "found";
            else if(file_char[pos+11] == to_find)
                std::cout << "found";
            else if(file_char[pos+12] == to_find)
                std::cout << "found";
            else if(file_char[pos+13] == to_find)
                std::cout << "found";
            else if(file_char[pos+14] == to_find)
                std::cout << "found";
            else if(file_char[pos+15] == to_find)
                std::cout << "found";
            else if(file_char[pos+16] == to_find)
                std::cout << "found";
            else if(file_char[pos+17] == to_find)
                std::cout << "found";
            else if(file_char[pos+18] == to_find)
                std::cout << "found";
            else if(file_char[pos+19] == to_find)
                std::cout << "found";
            else if(file_char[pos+20] == to_find)
                std::cout << "found";
            else if(file_char[pos+21] == to_find)
                std::cout << "found";
            else if(file_char[pos+22] == to_find)
                std::cout << "found";
            else if(file_char[pos+23] == to_find)
                std::cout << "found";
            else if(file_char[pos+24] == to_find)
                std::cout << "found";
            else if(file_char[pos+25] == to_find)
                std::cout << "found";
            else if(file_char[pos+26] == to_find)
                std::cout << "found";
            else if(file_char[pos+27] == to_find)
                std::cout << "found";
            else if(file_char[pos+28] == to_find)
                std::cout << "found";
            else if(file_char[pos+29] == to_find)
                std::cout << "found";
            else if(file_char[pos+30] == to_find)
                std::cout << "found";
            else if(file_char[pos+31] == to_find)
                std::cout << "found";

            pos+=32;
        }
    }
}

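I am using Visual Studio 2012 x64, and the program never prints anything because no char is 0. How can this be explained? How can I achieve the same performance without using 32 ifs?

Edit 1: If I write 64 ifs, there is no speed increase over the 32-if version.

Edit 2: If I remove the else keywords and keep the ifs, the program takes 4 seconds.

Now, how can these seemingly unreasonable results be explained?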

Answer 1

Your loop basically consists of two comparisons: pos < size and file_char[pos] == to_find. By unrolling the loop, you reduce the number of comparisons from 2 * size to (size + size / 32).
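To make that arithmetic concrete, here is a small sketch that only counts the comparisons each loop shape performs when no byte matches. The names ComparisonCount, count_plain and count_unrolled are made up for illustration, and the unrolled version assumes size is a multiple of the unroll factor, which holds for 60000 and 32.

```cpp
#include <cstddef>

// Hypothetical helper for illustration: tallies comparisons, performs no real search.
struct ComparisonCount {
    std::size_t bounds_checks;  // evaluations of `pos < size` that let the loop continue
    std::size_t byte_checks;    // evaluations of `file_char[pos] == to_find`
};

// Plain loop: one bounds check and one byte check per element.
ComparisonCount count_plain(std::size_t size) {
    ComparisonCount c{0, 0};
    for (std::size_t pos = 0; pos < size; ++pos) {
        ++c.bounds_checks;
        ++c.byte_checks;
    }
    return c;
}

// Unrolled loop: one bounds check per block of `unroll` elements.
// Assumes size is a multiple of unroll (true for 60000 and 32).
ComparisonCount count_unrolled(std::size_t size, std::size_t unroll) {
    ComparisonCount c{0, 0};
    for (std::size_t pos = 0; pos < size; pos += unroll) {
        ++c.bounds_checks;
        c.byte_checks += unroll;  // the else-if chain runs fully when nothing matches
    }
    return c;
}
```

For size = 60000, the plain loop performs 60000 + 60000 = 120000 comparisons per pass, while the 32-way version performs 1875 + 60000 = 61875.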

Answer 2

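I think the two pieces of code are not equivalent.

In the first one, the 'if' comparison is performed for every element.

In the second one, if the first comparison matches, you skip all the following ones (because of the else)! So you save a lot of comparisons, but you also miss checks.

To get equivalent code, you have to remove all the 'else' keywords.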
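The difference is easy to see on a small input. The sketch below uses blocks of 4 instead of 32 to stay short and counts matches instead of printing; the function names are hypothetical, and both functions assume size is a multiple of 4.

```cpp
#include <cstddef>

// else-if chain: at most one match is counted per block of 4.
// Assumes size is a multiple of 4 (simplification for illustration).
std::size_t count_else_if(const char* data, std::size_t size, char to_find) {
    std::size_t found = 0;
    for (std::size_t pos = 0; pos < size; pos += 4) {
        if (data[pos] == to_find) ++found;
        else if (data[pos + 1] == to_find) ++found;
        else if (data[pos + 2] == to_find) ++found;
        else if (data[pos + 3] == to_find) ++found;
    }
    return found;
}

// Independent ifs: every element is checked, so every match is counted.
std::size_t count_plain_if(const char* data, std::size_t size, char to_find) {
    std::size_t found = 0;
    for (std::size_t pos = 0; pos < size; pos += 4) {
        if (data[pos] == to_find) ++found;
        if (data[pos + 1] == to_find) ++found;
        if (data[pos + 2] == to_find) ++found;
        if (data[pos + 3] == to_find) ++found;
    }
    return found;
}
```

On the 8-byte input {'a', 0, 0, 'b', 'c', 'd', 'e', 0}, the else-if version counts 2 zeros (the second zero in the first block is skipped), while the version with independent ifs counts all 3. In the question's benchmark no byte is 0, so both chains execute every comparison and the else keywords skip nothing.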

Answer 3

To be sure, I ran some tests.

With g++ (under both Linux and Windows) I get the same kind of results as with Visual Studio:

Version 1 (no explicit loop unrolling)

g++ -O3 7.5s

Version 2 (explicit loop unrolling)

g++ -O3 2.1s

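But with the -funroll-loops option turned on (this optimization is usually not enabled by default because it may or may not make the code run faster):

Version 1 (no explicit loop unrolling)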

g++ -O3 -funroll-loops 2.2s

Version 2 (explicit loop unrolling)

g++ -O3 -funroll-loops 2.2s

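So this is related to loop unrolling.

Edit

You can change the last example (the while loop from the question's update) to insert a sentry explicitly, for example: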

int main()
{
  static const int size = 60000;

  char *file_char = (char*)malloc(size+1);  // The last element is the sentry

  // ...Fill file_char[]...

  file_char[size] = 0;  // the sentry

  // ...
}

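That way the test function cannot run past the end of the buffer (of course you have to check whether you found the sentry or a "good" zero, but that is only one extra if).

Version 3 (sentry)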

g++ -O3 0.68s

g++ -O3 -funroll-loops 0.72s
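For reference, a minimal sketch of a sentry-based search that includes the final "sentry or real match?" check might look like this. The function name find_with_sentry is hypothetical, and it assumes the caller allocated at least size + 1 bytes so that buf[size] can be overwritten.

```cpp
#include <cstddef>

// Returns the index of the first to_find in buf[0, size), or size if there is none.
// Assumes buf holds at least size + 1 bytes: buf[size] is reserved for the sentry
// and is overwritten by this function.
std::size_t find_with_sentry(char* buf, std::size_t size, char to_find) {
    buf[size] = to_find;   // the sentry guarantees that the scan terminates
    char* p = buf;
    while (*p != to_find)  // hot loop: a single comparison per byte, no bounds check
        ++p;
    // A result equal to size means only the sentry matched.
    return static_cast<std::size_t>(p - buf);
}
```

The caller compares the result against size once, after the loop, which is the single extra if mentioned above.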

Answer 4

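In your second example, once a match is found, the remaining comparisons in that block are skipped. If you can guarantee that there is at most one to_find per 32 indices, that works. But you could also rewrite it like this (it may contain an off-by-one error):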

void test(char *file_char, unsigned int size)
{
    for(unsigned int i = 0; i < 100000; i++)
    {
        unsigned int pos = 0;
        char to_find = 0;
        int skip = 32;
        while(pos < size)
        {
            if(file_char[pos++] == to_find)
            {
                std::cout << "found";
                pos+=skip;
            }
            skip--;
            if (!skip)
            {skip = 32;}
        }
    }
}
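If the goal is simply to report every occurrence without hand-written skip logic, one alternative is to let the standard library do the scanning. This is not the rewrite proposed above but a swapped-in approach based on std::memchr; the function name count_with_memchr is hypothetical, and whether it beats the hand-unrolled loop depends on the library implementation and has not been measured here.

```cpp
#include <cstddef>
#include <cstring>

// Counts every occurrence of to_find in data[0, size) by calling std::memchr repeatedly.
std::size_t count_with_memchr(const char* data, std::size_t size, char to_find) {
    std::size_t found = 0;
    const char* p = data;
    const char* end = data + size;
    while (p < end) {
        // Search only the part of the buffer that has not been scanned yet.
        const void* hit = std::memchr(p, to_find, static_cast<std::size_t>(end - p));
        if (hit == nullptr)
            break;                               // no further occurrence
        ++found;
        p = static_cast<const char*>(hit) + 1;   // resume right after the match
    }
    return found;
}
```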

Answer 5

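This is an optimization technique, known as loop unrolling, that optimizing compilers commonly apply. In the first code the inner loop has to run once per byte (60,000 iterations per pass), whereas in the second the number of iterations drops to 1/32 of that (60,000 / 32 = 1,875). The end of the loop body is compiled to a jump back to the start of the loop, and jumps can be expensive in machine code because they may cause the CPU's instruction buffer to be flushed. With unrolling, that jump is executed far less often, as are the loop test and the loop counter update. Over a large number of iterations this represents a significant improvement in execution time. Although the number of else tests inside the loop increases considerably, they will be compiled into a chain of conditional jumps similar to: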

if (condition1) goto found

if (condition2) goto found

...

found:

which improves performance significantly.
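As a rough sketch of what such unrolling looks like when written by hand, the function below searches for the first match with a 4-way unrolled main loop plus a tail loop for the leftover bytes. The name find_unrolled4 and the factor 4 are illustrative choices only; the early returns play the role of the "goto found" jumps.

```cpp
#include <cstddef>

// Returns the index of the first to_find in data[0, size), or size if there is none.
std::size_t find_unrolled4(const char* data, std::size_t size, char to_find) {
    std::size_t pos = 0;
    // Main loop: one bounds check and one backward jump per 4 bytes.
    for (; pos + 4 <= size; pos += 4) {
        if (data[pos] == to_find) return pos;
        if (data[pos + 1] == to_find) return pos + 1;
        if (data[pos + 2] == to_find) return pos + 2;
        if (data[pos + 3] == to_find) return pos + 3;
    }
    // Tail loop: handles the last size % 4 bytes one at a time.
    for (; pos < size; ++pos) {
        if (data[pos] == to_find) return pos;
    }
    return size;
}
```

Unlike the 32-if version in the question, this sketch does not require size to be a multiple of the unroll factor, because the tail loop covers the remainder.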

Low-level C/C++ performance?
