gcc branch prediction

Time: 2012-07-01 14:51:03

Tags: c gcc assembly branch-prediction

Here is my demo program:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int cmp(const void *d1, const void *d2)
{
    int a, b;

    a = *(int const *) d1;
    b = *(int const *) d2;

    if (a > b)
        return 1;
    else if (a == b)
        return 0;

    return -1;
}

int main()
{
    int seed = time(NULL);
    srandom(seed);

    int i, n, max = 32768, a[max];

    for (n=0; n < max; n++) {

        int r = random() % 256;
        a[n] = r;

    }

    qsort(a, max, sizeof(int), cmp);

    clock_t beg = clock();

    long long int sum = 0;

    for (i=0; i < 20000; i++) 
    {
        for (n=0; n < max; n++) {
            if (a[n] >= 128)
                sum += a[n];
        }
    }

    clock_t end = clock();

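    /* Note: where clock_t is an integer type (as on typical platforms),
       this division truncates the result to whole seconds. */
    double sec = (end - beg) / CLOCKS_PER_SEC;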

    printf("sec: %f\n", sec);
    printf("sum: %lld\n", sum);

    return 0;
}



unsorted
sec: 5.000000
sum: 63043880000

sorted
sec: 1.000000
sum: 62925420000

Here is the assembly diff between the two versions of the program, one with the qsort call and one without it:

--- unsorted.s  
+++ sorted.s    
@@ -58,7 +58,7 @@
    shrl    $4, %eax
    sall    $4, %eax
    subl    %eax, %esp
-   leal    4(%esp), %eax
+   leal    16(%esp), %eax
    addl    $15, %eax
    shrl    $4, %eax
    sall    $4, %eax
@@ -83,6 +83,13 @@
    movl    -16(%ebp), %eax
    cmpl    -24(%ebp), %eax
    jl  .L7
+   movl    -24(%ebp), %eax
+   movl    $cmp, 12(%esp)
+   movl    $4, 8(%esp)
+   movl    %eax, 4(%esp)
+   movl    -32(%ebp), %eax
+   movl    %eax, (%esp)
+   call    qsort
    movl    $0, -48(%ebp)
    movl    $0, -44(%ebp)
    movl    $0, -12(%ebp)

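As far as I understand the assembly output, the sorted version just has more code because of passing the arguments to qsort, but I don't see any branch optimization/prediction/whatever. Maybe I'm looking in the wrong direction?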

2 Answers:

Answer 0: (score: 5)

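Branch prediction is not something you will see at the assembly code level; it is done by the CPU itself at run time. Both versions execute exactly the same loop instructions, and the sorted input simply makes the `a[n] >= 128` branch far easier for the CPU to predict.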

Answer 1: (score: 0)

Built-in Function: long __builtin_expect (long exp, long c)

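You may use __builtin_expect to provide the compiler with branch prediction information. In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform. However, there are applications in which this data is hard to collect.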

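The return value is the value of exp, which should be an integral expression. The semantics of the built-in are that it is expected that exp == c. For example: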
if (__builtin_expect (x, 0))
  foo ();
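indicates that we do not expect to call foo, since we expect x to be zero. Since you are limited to integral expressions for exp, you should use constructions such as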
if (__builtin_expect (ptr != NULL, 1))
  foo (*ptr);
when testing pointer or floating-point values.

Otherwise, branch prediction is determined by the processor...

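Branch prediction predicts the branch target and enables the processor to begin executing instructions long before the branch true execution path is known. All branches utilize the branch prediction unit (BPU) for prediction. This unit predicts the target address not only based on the EIP of the branch but also based on the execution path through which execution reached this EIP. The BPU can efficiently predict the following branch types: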

• Conditional branches.

• Direct calls and jumps.

• Indirect calls and jumps.

• Returns.

The microarchitecture tries to overcome this problem by feeding the most probable branch into the pipeline and execut[ing] it speculatively.

...Using various methods of branch prediction