Question

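I have to answer a question about a relatively simple C code snippet. In the function below, what is the most expensive part in terms of performance or time complexity? I honestly don't know how to answer, because I think it depends on the if statement. I also don't know whether comparisons are expensive, and the same goes for returns, struct accesses, and multiplications.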

By the way, info_h is a struct.

RGBPixel* bm_get_pixel_at(
    unsigned int x,
    unsigned int y,
    RGBPixel *pixel_array )
{
    int index;
    int width_with_padding = info_h.width;
    if( x >= info_h.width || y >= info_h.height ){ return &blackPixel; }
    index = (info_h.height-1 - y) * width_with_padding + x;
    return pixel_array + index;
}

Edit:

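OK, so this question may have been a bit odd. I should add that this is just one of many functions in a somewhat more complex C program, and we have now run an oprofile script 30 times. The script reports the average number of times oprofile sampled each procedure. In that results table, this function ranks third by sample count. So the follow-up question is: which part of this function causes the program to spend so much of its time inside it? Sorry if that wasn't clear at first.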

Answer 1

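Since you left out the rest of the program, everything here is guesswork. This function probably shows up so prominently in your profiling results because it is called inside an inner loop. What the function does (quite obviously) is return the memory location of a pixel or, if the requested index lies outside the bounds of the pixel array, the memory location of a dummy pixel.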

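If the function is called in a loop, the bounds check runs on every iteration, which is of course redundant. As far as optimization goes, this is the low-hanging fruit: move the bounds check in front of the loop and make sure the loop itself never leaves the bounds: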

static inline
RGBPixel* bm_get_pixel_at_UNSAFE(
    unsigned int x,
    unsigned int y,
    RGBPixel *pixel_array )
{
    size_t const width_with_padding = info_h.width;
    size_t const index = ((size_t)info_h.height-1 - y) * width_with_padding + x;
    return &pixel_array[index];
}

RGBPixel* bm_get_pixel_at(
    unsigned int x,
    unsigned int y,
    RGBPixel *pixel_array )
{
    return ( x < info_h.width && y < info_h.height ) ?
          bm_get_pixel_at_UNSAFE(x,y, pixel_array)
        : &blackPixel;
}

void foo(RGBPixel *pixel_array)
{
    /* iteration stays inside array bounds */
    for( unsigned int y = 0; y < info_h.height; ++y )
    for( unsigned int x = 0; x < info_h.width;  ++x ){
        RGBPixel *px = bm_get_pixel_at_UNSAFE(x, y, pixel_array);
        /* ... */
    }
}
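
For anyone who wants to compile and try the idea, here is a minimal self-contained sketch. The question never shows the real definitions of RGBPixel, info_h, or blackPixel, so the struct layouts, the 4x3 image size, and the helper names pixel_at_unchecked, pixel_at, and sum_red are all assumptions made up for illustration. It also assumes rows carry no padding, as the original index computation does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for types the question does not show. */
typedef struct { unsigned char r, g, b; } RGBPixel;
typedef struct { unsigned int width, height; } InfoHeader;

static InfoHeader info_h = { 4, 3 };      /* assumed: 4x3 image, no row padding */
static RGBPixel blackPixel = { 0, 0, 0 };

/* Unchecked accessor: the caller guarantees x < width and y < height.
   The assert documents that contract and compiles away when NDEBUG is defined. */
static inline RGBPixel *pixel_at_unchecked(unsigned int x, unsigned int y,
                                           RGBPixel *pixel_array)
{
    assert(x < info_h.width && y < info_h.height);
    size_t const row = (size_t)info_h.height - 1 - y;   /* rows are stored bottom-up */
    return &pixel_array[row * info_h.width + x];
}

/* Checked accessor for callers that cannot guarantee valid coordinates. */
static RGBPixel *pixel_at(unsigned int x, unsigned int y, RGBPixel *pixel_array)
{
    return (x < info_h.width && y < info_h.height)
         ? pixel_at_unchecked(x, y, pixel_array)
         : &blackPixel;
}

/* Sums the red channel. The loop bounds alone keep every access valid,
   so no per-pixel bounds check is needed in a release build. */
static unsigned long sum_red(RGBPixel *pixel_array)
{
    unsigned long total = 0;
    for (unsigned int y = 0; y < info_h.height; ++y)
        for (unsigned int x = 0; x < info_h.width; ++x)
            total += pixel_at_unchecked(x, y, pixel_array)->r;
    return total;
}
```

Callers that receive coordinates from outside (user input, for instance) would keep using the checked pixel_at, while tight loops whose bounds come from info_h itself use the unchecked variant. Whether this actually moves the function down in your oprofile table is something only a new measurement can tell.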

Which part of this C code is the most expensive in terms of performance?
