Question

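To teach myself OpenGL, I am working through the 5th edition of the Superbible.

I am currently trying to figure out how to combine HDR and MSAA (as described in chapter 9).

For HDR, the book proposes an adaptive tone mapping method based on computing the average luminance over a 5x5 convolution filter for each fragment.

For MSAA, the method used averages all samples with weights computed from the sample distance.

My attempt at combining the two, in the pastebin below, applies tone mapping to each sample and then averages the results to compute the final fragment color.

Performance is (as one should have expected?) terrible: with 25 lookups per sample, times 4 samples for 4xMSAA, I am guessing the GPU spends most of its time fetching from my FBO texture. Switching to the code path controlled by the use_HDR uniform makes performance drop from 400+ fps to under 10 for a simple scene.

My question is twofold:

Is this a sane way to perform tone mapping? If not, what would you suggest?
How should MSAA be combined with convolution-based filters? I am guessing I will run into this problem again with any filter that needs to look up neighboring texels, i.e. pretty much everything like bloom, blur, etc.?

Code: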

#version 330
in Data
{
    vec4 position;
    vec4 normal;
    vec4 color;
    vec2 texCoord;
    mat4 mvp;
    mat4 mv;
} gdata;

out vec4 outputColor;
uniform sampler2DMS tex;
uniform sampler1D lum_to_exposure;
uniform samplerBuffer weights;
uniform int samplecount;
uniform bool use_HDR;

vec4 tone_map(vec4 color, float exp)
{
    return 1.0f - exp2(-color * exp);
}

const ivec2 tc_offset[25] = ivec2[](ivec2(-2, -2), ivec2(-1, -2), ivec2(0, -2), ivec2(1, -2), ivec2(2, -2),
                                    ivec2(-2, -1), ivec2(-1, -1), ivec2(0, -1), ivec2(1, -1), ivec2(2, -1),
                                    ivec2(-2,  0), ivec2(-1,  0), ivec2(0,  0), ivec2(1,  0), ivec2(2,  0),
                                    ivec2(-2,  1), ivec2(-1,  1), ivec2(0,  1), ivec2(1,  1), ivec2(2,  1),
                                    ivec2(-2,  2), ivec2(-1,  2), ivec2(0,  2), ivec2(1,  2), ivec2(2,  2));

void main()
{
    ivec2 itexcoords = ivec2(floor(textureSize(tex) * gdata.texCoord));
    float tex_size_x = textureSize(tex).x;
    float tex_size_y = textureSize(tex).y;
    outputColor = vec4(0.0f, 0.0f, 0.0f, 1.0f);
    // for each sample in the multi sample buffer...
    for (int i = 0; i < samplecount; i++)
    {
        // ... calculate exposure based on the corresponding sample of nearby texels
        vec4 sample;
        if (use_HDR)
        {
            sample = texelFetch(tex, itexcoords, i);

            // look up a 5x5 area around the current texel
            vec4 hdr_samples[25];
            for (int j = 0; j < 25; ++j)
            {
                // clamp to the last valid texel (size - 1); texelFetch outside the texture is undefined
                ivec2 coords = clamp(itexcoords + tc_offset[j], ivec2(0, 0), ivec2(tex_size_x, tex_size_y) - 1);
                hdr_samples[j] = texelFetch(tex, coords, i);
            }
            // average the surrounding texels
            vec4 area_color = (
                     ( 1.0f * (hdr_samples[0] + hdr_samples[4] + hdr_samples[20] + hdr_samples[24])) +
                     ( 4.0f * (hdr_samples[1] + hdr_samples[3] + hdr_samples[5] + hdr_samples[9]
                             + hdr_samples[15] + hdr_samples[19] + hdr_samples[21] + hdr_samples[23])) +
                     ( 7.0f * (hdr_samples[2] + hdr_samples[10] + hdr_samples[14] + hdr_samples[22])) +
                     (16.0f * (hdr_samples[6] + hdr_samples[8] + hdr_samples[16] + hdr_samples[18])) +
                     (26.0f * (hdr_samples[7] + hdr_samples[11] + hdr_samples[13] + hdr_samples[17])) +
                     (41.0f * (hdr_samples[12]))
                     ) / 273.0f;
            // RGB to luminance formula : lum = 0.3R + 0.59G + 0.11B
            float area_luminance = dot(area_color.rgb, vec3(0.3, 0.59, 0.11));
            float exposure = texture(lum_to_exposure, area_luminance/2.0).r;
            exposure = clamp(exposure, 0.02f, 20.0f);


            sample = tone_map(sample, exposure);
        }
        else
            sample = texelFetch(tex, itexcoords, i);

        // weight the sample based on its position
        float weight = texelFetch(weights, i).r;
        outputColor += sample * weight;
    }
}

Answer 1

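I don't have a copy of the Superbible, so I don't know exactly what it proposes, but this approach looks very inefficient and imprecise: your 5x5 filter only accesses the i-th sample of each texel and misses the other samples entirely.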

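For the filtering stage I would, as kvark already suggested, use glBlitFramebuffer to resolve all samples into another texture, accumulated in HDR. After that, filter into yet another HDR texture with a separable filter, possibly using bilinear filtering to gain performance, or even using the GPU hardware to help with performance; see Matt Pettineo's recent blog post.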
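A minimal sketch of one pass of such a separable blur, assuming the resolved HDR image is bound as a single-sample texture and the pass renders a full-screen quad into a target of the same size. The names resolved_tex and direction are made up for illustration. Note that it swaps in binomial weights [1 4 6 4 1] / 16, which are exactly separable; the 273-normalized kernel in the question is not an exact outer product of a 1-D kernel, so the result will differ slightly.

```glsl
#version 330
// One pass of a separable 5-tap blur over a single-sample HDR texture.
// Run it twice: direction = (1, 0) into a scratch FBO, then direction = (0, 1)
// with the scratch texture bound as input.
// resolved_tex and direction are hypothetical names, not from the question.

uniform sampler2D resolved_tex; // resolved HDR image (or the first pass output)
uniform ivec2 direction;        // (1, 0) = horizontal pass, (0, 1) = vertical pass

out vec4 outputColor;

// Binomial weights, sum = 16. Assumption: replaces the question's 5x5 kernel.
const float kernel[5] = float[](1.0, 4.0, 6.0, 4.0, 1.0);

void main()
{
    ivec2 size = textureSize(resolved_tex, 0);
    ivec2 center = ivec2(gl_FragCoord.xy); // assumes viewport size == texture size

    vec4 sum = vec4(0.0);
    for (int k = -2; k <= 2; ++k)
    {
        // clamp to the valid texel range so edge pixels repeat the border texel
        ivec2 coords = clamp(center + k * direction, ivec2(0), size - 1);
        sum += kernel[k + 2] * texelFetch(resolved_tex, coords, 0);
    }
    outputColor = sum / 16.0;
}
```

Both render targets should use a floating-point format (for example GL_RGBA16F), otherwise the HDR range is clipped before tone mapping.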

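This gives you a blurred texture that you can then sample in the tone mapping shader. It should improve performance considerably, at the cost of more memory.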
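As a rough count of neighborhood fetches per output pixel, assuming 4xMSAA and a 5-tap separable kernel run as two passes (this ignores the resolve itself and the bandwidth spent writing the intermediate textures, so it is only an indication):

```latex
\begin{aligned}
\text{shader in the question:} &\quad 4 \times 25 = 100 \ \text{fetches} \\
\text{resolve + separable blur:} &\quad 5 + 5 = 10 \ \text{fetches, plus } 1 \ \text{lookup in the tone mapping pass}
\end{aligned}
```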

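Note that other tone mapping operators exist and that there is no "ground truth" in this field. You could choose a cheaper method instead of this fine-grained luminance estimate.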
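As one example of a cheaper operator, here is a sketch of a Reinhard-style global curve, which needs no neighborhood lookups at all. This is not the book's method; hdr_tex and exposure are hypothetical names, and the exposure value is assumed to be chosen by the application.

```glsl
#version 330
// Global Reinhard-style tone mapping: c / (1 + c) per channel.
// hdr_tex and exposure are hypothetical names for illustration.

uniform sampler2D hdr_tex;  // resolved (single-sample) HDR color
uniform float exposure;     // scalar scale applied before the curve

out vec4 outputColor;

void main()
{
    vec3 hdr = texelFetch(hdr_tex, ivec2(gl_FragCoord.xy), 0).rgb;
    vec3 c = hdr * exposure;
    // maps [0, inf) smoothly into [0, 1)
    outputColor = vec4(c / (1.0 + c), 1.0);
}
```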

You could also read up further on tone mapping, which may give you hints on how to improve things, possibly by using glGenerateMipmap to create the luminance texture.
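A sketch of how the mipmap idea could look on the shader side, assuming the application has rendered luminance into the red channel of a texture, called glGenerateMipmap on it, and passes in the index of the 1x1 level (floor(log2(max(width, height)))). The names hdr_tex, lum_tex and top_level are assumptions. This yields one exposure for the whole frame rather than the per-fragment 5x5 estimate from the question; fetching from a lower mip level would give a more local average.

```glsl
#version 330
// Tone mapping with a frame-wide average luminance read from the smallest mip level.
// hdr_tex, lum_tex and top_level are hypothetical names for illustration.

uniform sampler2D hdr_tex;          // resolved HDR color
uniform sampler2D lum_tex;          // luminance in .r, full mip chain built with glGenerateMipmap
uniform int top_level;              // index of the 1x1 mip level, supplied by the application
uniform sampler1D lum_to_exposure;  // same lookup table as in the question

out vec4 outputColor;

void main()
{
    // The 1x1 level holds roughly the mean luminance; the exact downsampling
    // filter used by glGenerateMipmap is up to the implementation.
    float avg_luminance = texelFetch(lum_tex, ivec2(0), top_level).r;

    float exposure = texture(lum_to_exposure, avg_luminance / 2.0).r;
    exposure = clamp(exposure, 0.02, 20.0);

    vec4 hdr = texelFetch(hdr_tex, ivec2(gl_FragCoord.xy), 0);
    outputColor = 1.0 - exp2(-hdr * exposure); // same curve as tone_map() in the question
}
```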

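As for the specific problem of tone mapping with MSAA, the only thing I know is that it is advised to tone map the individual samples before the MSAA resolve, to prevent aliasing artifacts.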
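Put together, a per-sample tone mapping pass might look roughly like this. It assumes a blurred, single-sample HDR copy of the scene (blurred_tex, a made-up name) at the same resolution as the multisample texture, so the exposure is computed once per pixel and only the tone curve runs per sample. For brevity it averages the samples with equal weights instead of the weights buffer from the question.

```glsl
#version 330
// Tone map each MSAA sample, then average: one exposure lookup per pixel,
// samplecount fetches from the multisample texture.
// blurred_tex is a hypothetical name; the other uniforms follow the question.

uniform sampler2DMS tex;            // multisampled HDR color
uniform sampler2D blurred_tex;      // assumed: blurred, resolved HDR copy, same size as tex
uniform sampler1D lum_to_exposure;
uniform int samplecount;

out vec4 outputColor;

void main()
{
    ivec2 coords = ivec2(gl_FragCoord.xy); // assumes a full-screen pass at texture resolution

    // neighborhood average comes from a single fetch of the pre-blurred texture
    vec3 area_color = texelFetch(blurred_tex, coords, 0).rgb;
    float area_luminance = dot(area_color, vec3(0.3, 0.59, 0.11));
    float exposure = texture(lum_to_exposure, area_luminance / 2.0).r;
    exposure = clamp(exposure, 0.02, 20.0);

    vec4 sum = vec4(0.0);
    for (int i = 0; i < samplecount; ++i)
    {
        vec4 hdr = texelFetch(tex, coords, i);
        sum += 1.0 - exp2(-hdr * exposure); // tone map before averaging
    }
    outputColor = sum / float(samplecount); // equal-weight resolve (assumption)
}
```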

Answer 2

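As far as I can tell from your GLSL code, all samples of a pixel are weighted equally. From this I conclude that the code is interested in the sum of those samples for each pixel, and the sum is just the average times the number of samples. From here I can see at least two optimization techniques. Both use an intermediate single-sampled texture that your code should sample from instead of the original multisampled one:

(Exactly what you are doing now.) Produce the intermediate texture with a shader that writes the average of the samples for each pixel.
(Fast approximation.) Let the intermediate texture simply be the resolved original texture. This can be done efficiently by calling glBlitFramebuffer(). It will produce slightly different results (because the sample positions are not on a grid), but for your task, HDR, it should not matter, since it is pretty much all approximation anyway :)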
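For the first option, the resolve shader could be as small as the sketch below. It reuses the uniform names from the question and assumes the weights in the buffer sum to 1 (with equal weights each would be 1.0 / samplecount). The render target is assumed to be a single-sample floating-point texture so the HDR range survives.

```glsl
#version 330
// Custom MSAA resolve: writes the weighted average of all samples of a pixel
// into a single-sample target. Uniform names follow the shader in the question.

uniform sampler2DMS tex;
uniform samplerBuffer weights;  // assumed to sum to 1 over all samples
uniform int samplecount;

out vec4 outputColor;

void main()
{
    ivec2 coords = ivec2(gl_FragCoord.xy); // assumes a full-screen pass at texture resolution

    vec4 sum = vec4(0.0);
    for (int i = 0; i < samplecount; ++i)
    {
        sum += texelFetch(tex, coords, i) * texelFetch(weights, i).r;
    }
    outputColor = sum; // still HDR; tone mapping happens in a later pass
}
```

The 5x5 filter and tone mapping can then read this texture with an ordinary sampler2D, so the neighborhood lookups no longer scale with the sample count.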

HDR, adaptive tone mapping and MSAA in GLSL