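I am trying to implement SSAO following OGLDev Tutorial 45, which is based on a tutorial by John Chapman. The OGLDev tutorial uses a highly simplified method: it samples random points within a radius around the fragment position and increases the AO factor depending on how many of the sampled points have a depth greater than the actual surface depth stored at that location (the more positions around the fragment lie in front of it, the greater the occlusion).
The 'engine' I use does not have modular deferred shading like OGLDev's, but basically it first renders the whole screen color to a framebuffer with a texture attachment and a depth renderbuffer attachment. To compare depths, the fragment view-space positions are rendered to another framebuffer with a texture attachment. These textures are then post-processed by the SSAO shader, and the result is drawn to a screen-filling quad. Both textures draw fine to the quad on their own, and the shader input uniforms seem to be fine as well, which is why I have not included any engine code.
The fragment shader is almost identical to the tutorial's and is shown below. I have added some comments that help my own understanding.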
#version 330 core
in vec2 texCoord;
layout(location = 0) out vec4 outColor;
const int RANDOM_VECTOR_ARRAY_MAX_SIZE = 128; // reference uses 64
const float SAMPLE_RADIUS = 1.5f; // TODO: play with this value, reference uses 1.5
uniform sampler2D screenColorTexture; // the whole rendered screen
uniform sampler2D viewPosTexture; // interpolated vertex positions in view space
uniform mat4 projMat;
// we use a uniform buffer object for better performance
layout (std140) uniform RandomVectors
{
    // note: std140 pads each vec3 array element to 16 bytes
    vec3 randomVectors[RANDOM_VECTOR_ARRAY_MAX_SIZE];
};
void main()
{
    vec4 screenColor = texture(screenColorTexture, texCoord).rgba;
    vec3 viewPos = texture(viewPosTexture, texCoord).xyz;
    float AO = 0.0;
    // sample random points to compare depths around the view space position.
    // the more sampled points lie in front of the actual depth at the sampled position,
    // the higher the probability of the surface point to be occluded.
    for (int i = 0; i < RANDOM_VECTOR_ARRAY_MAX_SIZE; ++i) {
        // take a random sample point.
        vec3 samplePos = viewPos + randomVectors[i];
        // project sample point onto near clipping plane
        // to find the depth value (i.e. actual surface geometry)
        // at the given view space position for which to compare depth
        vec4 offset = vec4(samplePos, 1.0);
        offset = projMat * offset; // project onto near clipping plane
        offset.xy /= offset.w; // perform perspective divide
        offset.xy = offset.xy * 0.5 + vec2(0.5); // transform to [0,1] range
        float sampleActualSurfaceDepth = texture(viewPosTexture, offset.xy).z;
        // compare depth of random sampled point to actual depth at sampled xy position:
        // the function step(edge, value) returns 1 if value >= edge, else 0
        // thus if the random sampled point's depth is greater than (lies behind) the actual surface depth at that point,
        // the probability of occlusion increases.
        // note: if the actual depth at the sampled position is too far off from the depth at the fragment position,
        // i.e. the surface has a sharp ridge/crevice, it doesn't add to the occlusion, to avoid artifacts.
        if (abs(viewPos.z - sampleActualSurfaceDepth) < SAMPLE_RADIUS) {
            AO += step(sampleActualSurfaceDepth, samplePos.z);
        }
    }
    // normalize the ratio of sampled points lying behind the surface to a probability in [0,1]
    // the occlusion factor should make the color darker, not lighter, so we invert it.
    AO = 1.0 - AO / float(RANDOM_VECTOR_ARRAY_MAX_SIZE);
    ///
    outColor = screenColor + mix(vec4(0.2), vec4(pow(AO, 2.0)), 1.0);
    /*/
    outColor = vec4(viewPos, 1); // DEBUG: draw view space positions
    //*/
}
vec2 texCoord = gl_FragCoord.xy / textureSize(screenColorTexture, 0);
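To check the mapping from a view-space sample position to texture coordinates outside the shader, the three steps (projection, perspective divide, remap to [0,1]) can be reproduced on the CPU. This is only a sketch: `perspective`, `mul` and `viewPosToTexCoord` are hypothetical helpers, and it assumes `projMat` is a standard right-handed OpenGL perspective matrix, which the question does not confirm.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<float, 16>; // column-major, like OpenGL

// Right-handed OpenGL-style perspective matrix (assumed convention; fovY in radians).
Mat4 perspective(float fovY, float aspect, float zNear, float zFar) {
    assert(zNear > 0.0f && zFar > zNear);
    const float f = 1.0f / std::tan(fovY * 0.5f);
    Mat4 m{}; // all zeros
    m[0]  = f / aspect;
    m[5]  = f;
    m[10] = (zFar + zNear) / (zNear - zFar);
    m[11] = -1.0f; // puts -z into clip-space w
    m[14] = (2.0f * zFar * zNear) / (zNear - zFar);
    return m;
}

// Matrix * column vector for a column-major 4x4 matrix.
Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            r[row] += m[col * 4 + row] * v[col];
    return r;
}

struct TexCoord { float u, v; };

// Same steps as in the shader loop: project, divide by w, remap [-1,1] to [0,1].
TexCoord viewPosToTexCoord(const Mat4& proj, float x, float y, float z) {
    const Vec4 p = {x, y, z, 1.0f};
    const Vec4 clip = mul(proj, p);
    assert(clip[3] != 0.0f);
    TexCoord t;
    t.u = clip[0] / clip[3] * 0.5f + 0.5f;
    t.v = clip[1] / clip[3] * 0.5f + 0.5f;
    return t;
}
```

Under these assumptions a point on the view axis, such as (0, 0, -5), should land at the centre of the texture (0.5, 0.5). If the engine's matrix gives something else for such a point, the convention differs from the one assumed here.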
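When I set the AO blend factor at the bottom of the fragment shader to 0, it runs smoothly at the fps cap (even though the calculations are still performed, at least I guess the compiler does not optimize them away :D). But when the AO is blended in, each frame takes up to 80 ms to draw (getting slower over time, as if buffers were filling up), and the result is really interesting and confusing:
Obviously the mapping looks far off, and the flickering noise seems very random, as if it corresponded directly to the random sample vectors. What I find most interesting is that the draw time increases massively only when the AO factor is added in, not because of the occlusion calculation itself. Is there a problem with the draw buffers?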
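For reference, the blend factor mentioned above is the last argument of GLSL `mix(x, y, a)`, which computes `x * (1 - a) + y * a`. The sketch below reproduces the shader's final line for a single colour channel on the CPU. `shadeChannel` is a hypothetical helper, not part of the shader, and it says nothing about the performance of `mix` itself.

```cpp
#include <cassert>

// Scalar version of GLSL mix(x, y, a) = x * (1 - a) + y * a.
float mix(float x, float y, float a) {
    return x * (1.0f - a) + y * a;
}

// Hypothetical helper mirroring the shader's last line for one colour channel:
// outColor = screenColor + mix(0.2, pow(AO, 2.0), blendFactor)
float shadeChannel(float screenColor, float ao, float blendFactor) {
    assert(ao >= 0.0f && ao <= 1.0f);
    const float aoSquared = ao * ao; // pow(AO, 2.0) for non-negative AO
    return screenColor + mix(0.2f, aoSquared, blendFactor);
}
```

With a blend factor of 1.0 the constant 0.2 drops out and only the squared AO term remains. With 0.0 the AO value has no effect on the output at all. In both cases the term is added to the screen colour rather than multiplied with it.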
Answer 0 (score: 0)
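The problem appears to be related to the chosen texture types.
The texture with the handle viewPosTexture needs to be explicitly defined with a floating-point texture format such as GL_RGB16F or GL_RGBA32F, rather than just GL_RGB. Interestingly, each texture was drawn fine on its own; the problems only appeared when they were combined.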
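A rough illustration of why the format matters for view-space positions: normalised 8-bit colour formats clamp every channel to [0,1] and quantise it to 256 steps, while a floating-point format keeps the value as written. The helpers below are hypothetical and only simulate that storage on the CPU. That an unsized GL_RGB attachment ends up as 8-bit normalised storage is an assumption about the driver, not something the original post states.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Round trip of one channel through an 8-bit unsigned normalised format
// (assumed to be what an unsized GL_RGB attachment typically resolves to).
float storeAsUnorm8(float value) {
    assert(!std::isnan(value));
    const float clamped = std::min(std::max(value, 0.0f), 1.0f); // values outside [0,1] are lost
    const std::uint8_t bits = static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
    return static_cast<float>(bits) / 255.0f;
}

// A 32-bit floating-point channel (e.g. GL_RGBA32F) keeps the value unchanged.
float storeAsFloat32(float value) {
    return value;
}
```

A view-space z of, say, -3.7 would come back as 0 from the 8-bit path, so all depth comparisons in the SSAO loop would operate on clamped values. Displaying the texture on its own can still look plausible because the screen output is clamped to [0,1] anyway.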
// generate screen color texture
// note: GL_NEAREST interpolation is ok since there is no subpixel sampling anyway
glGenTextures(1, &screenColorTexture);
glBindTexture(GL_TEXTURE_2D, screenColorTexture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, windowWidth, windowHeight, 0, GL_BGR, GL_UNSIGNED_BYTE, NULL);
// generate depth renderbuffer. without this, depth testing won't work.
// we use a renderbuffer since we won't have to sample this, opengl uses it directly.
glGenRenderbuffers(1, &screenDepthBuffer);
glBindRenderbuffer(GL_RENDERBUFFER, screenDepthBuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT, windowWidth, windowHeight);
// generate vertex view space position texture
glGenTextures(1, &viewPosTexture);
glBindTexture(GL_TEXTURE_2D, viewPosTexture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, windowWidth, windowHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, NULL);
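The slow drawing might be caused by the GLSL mix function. I will investigate this further.
The flickering was caused by regenerating and passing new random vectors every frame. Generating the vectors once and passing a sufficient number of them solves the problem. Otherwise, it might help to blur the SSAO result.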
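A minimal sketch of generating the sample vectors a single time, assuming points inside a sphere of radius `radius` are acceptable as offsets. `PaddedVec3`, `makeSampleKernel` and the seed value are made up for illustration. The padding float reflects the std140 rule that each `vec3` array element occupies 16 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One array element as laid out by std140: a vec3 in an array has a 16-byte stride.
struct PaddedVec3 { float x, y, z, pad; };
static_assert(sizeof(PaddedVec3) == 16, "std140 array stride for vec3 is 16 bytes");

// Generates `count` random offsets inside a sphere of the given radius.
// The fixed seed (arbitrary value) makes the kernel identical on every call.
std::vector<PaddedVec3> makeSampleKernel(std::size_t count, float radius, unsigned seed = 1234u) {
    assert(count > 0 && radius > 0.0f);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    std::vector<PaddedVec3> kernel;
    kernel.reserve(count);
    while (kernel.size() < count) {
        const float x = dist(rng);
        const float y = dist(rng);
        const float z = dist(rng);
        // rejection sampling: discard points outside the unit sphere
        if (x * x + y * y + z * z > 1.0f) continue;
        const PaddedVec3 sample = {x * radius, y * radius, z * radius, 0.0f};
        kernel.push_back(sample);
    }
    return kernel;
}
```

The resulting vector could be uploaded once during initialisation, for example with `glBufferData(GL_UNIFORM_BUFFER, kernel.size() * sizeof(PaddedVec3), kernel.data(), GL_STATIC_DRAW)`, instead of being refilled every frame.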
Basically, the SSAO works now! What remains are just more or less noticeable bugs.