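I've run into an unusual problem while working on an OpenGL project. Basically, I need frame data in a single-channel GRAYSCALE format for some CV work. I'm using a custom shader, FBOs, and PBOs to get this done.
The program flow is as follows.
I can confirm that this process works. What I'd like to know is whether anything can be done to improve performance. I'll post some of the code I'm using so we can all follow along.
PBO generation code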
final int[] pbuffers = new int[2];
GLES30.glGenBuffers(2, pbuffers, 0);
for (int i = 0; i < pbuffers.length; i++) {
    GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbuffers[i]);
    GLES30.glBufferData(GLES30.GL_PIXEL_PACK_BUFFER, width * height, null, GLES30.GL_DYNAMIC_READ);
    GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
}
pbo_id[PBO_PRIMARY_ID] = pbuffers[0];
pbo_id[PBO_SECONDARY_ID] = pbuffers[1];
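One assumption worth spelling out in the allocation above: `width * height` bytes is only enough if each row of GL_RED / GL_UNSIGNED_BYTE data needs no padding. Assuming GL_PACK_ALIGNMENT is left at its default of 4, each row is padded to a multiple of 4 bytes, which changes nothing for a width of 480 but would for a width such as 481. A minimal sketch of that size calculation follows; the class and method names are hypothetical and not part of the project code.

```java
/** Hypothetical helper for sizing a pixel pack buffer; covers 1-byte components only. */
final class PackedSize {
    private PackedSize() {}

    /**
     * Bytes per row once padded to the pack alignment.
     * Assumes GL_UNSIGNED_BYTE components, so one component is one byte.
     */
    static int rowStride(int width, int componentsPerPixel, int packAlignment) {
        int rowBytes = width * componentsPerPixel;
        // Round up to the next multiple of packAlignment.
        return ((rowBytes + packAlignment - 1) / packAlignment) * packAlignment;
    }

    /** Conservative buffer size: every row, including the last, counted at the padded stride. */
    static int bufferBytes(int width, int height, int componentsPerPixel, int packAlignment) {
        return rowStride(width, componentsPerPixel, packAlignment) * height;
    }
}
```

For the 480 x 360 single-channel case, `PackedSize.bufferBytes(480, 360, 1, 4)` gives the same value as `width * height`, so the allocation above is sufficient for that resolution under the stated assumption.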
Step 3 in the list -> bind the PBO and call glReadPixels()
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbo_id[currentBuffer]);
GLES30.glReadBuffer(GLES30.GL_COLOR_ATTACHMENT0);
JNI.glReadPixels(0, 0, width, height, GL_RED, GL_UNSIGNED_BYTE, 0);
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
final int prevBuffer = previousBuffer;
previousBuffer = currentBuffer;
currentBuffer = prevBuffer;
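The index swap at the end of this step is what the later `dualPBO.swap()` / `dualPBO.getID()` calls stand for. A minimal sketch of that ping-pong bookkeeping as a standalone class, assuming exactly two IDs; `PingPongIds` and its method names are hypothetical and may not match the real `dualPBO` implementation:

```java
/** Hypothetical helper: tracks which of two GL object IDs is current and which is previous. */
final class PingPongIds {
    private final int[] ids;
    private int current = 0; // index of the ID used for this frame

    PingPongIds(int first, int second) {
        this.ids = new int[] { first, second };
    }

    /** ID to bind for this frame's work (for example the glReadPixels target). */
    int current() {
        return ids[current];
    }

    /** The other ID, i.e. the one that was current before the last swap. */
    int previous() {
        return ids[1 - current];
    }

    /** Exchanges the roles of the two IDs. */
    void swap() {
        current = 1 - current;
    }
}
```

With this shape, a frame would bind `current()` for the read, call `swap()`, and then bind `current()` again to map the buffer that was filled one frame earlier, which mirrors the order used in the code above.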
Step 4 in the list -> bind the previous frame's PBO and call glMapBufferRange(). This is the PBO that glReadPixels was issued against during the last frame.
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbo_id[currentBuffer]);
JNI.glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, width * height, GL_MAP_READ_BIT);
GLES30.glUnmapBuffer(GLES30.GL_PIXEL_PACK_BUFFER);
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
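This is where the performance problem comes from. Currently I'm reading back 480 x 360 pixels of single-channel grayscale (computed in the shader). I've run some benchmarks, and the results are as follows.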
40-50ms -> JNI.glReadPixels(0, 0, width, height, GL_RED, GL_UNSIGNED_BYTE, 0);
0-1ms -> JNI.glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, width * height, GL_MAP_READ_BIT);
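For context on how numbers like these can be collected, here is a minimal sketch of a wall-clock timer around a single call. `CallTimer` is a hypothetical name, not the benchmarking code actually used. Note that this only measures how long the call blocks the calling thread; it says nothing about when the GPU finishes the work.

```java
/** Hypothetical helper: measures how long one call blocks the calling thread. */
final class CallTimer {
    private CallTimer() {}

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static double timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0;
    }
}
```

Usage would look like `double ms = CallTimer.timeMillis(() -> JNI.glReadPixels(0, 0, width, height, GL_RED, GL_UNSIGNED_BYTE, 0));`, assuming the call runs on the thread that owns the GL context.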
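As I understand it, glReadPixels into a PBO is not supposed to be a blocking call, but for whatever reason it blocks here (and performs far worse than simply reading directly from the FBO). glMapBufferRange appears to behave as expected and returns the required data correctly.
The only thing I can think of is that I'm using GL_RED to read back a single channel, but that still doesn't explain why glReadPixels blocks.
Devices I used for benchmarking (consistent behavior).
Any help with this would be highly appreciated! In the meantime, I'll keep experimenting to see whether I've missed anything obvious.
Edit -> 2017/03/16 (added more code for clarity)
FBO setup code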
final int[] values = new int[1];
GLES30.glGenTextures(1, values, 0);
GLES30.glBindTexture(GLES30.GL_TEXTURE_2D, values[0]);
// we only want GRAYSCALE / Single channel texture
GLES30.glTexImage2D(GLES30.GL_TEXTURE_2D, 0, GLES30.GL_R8, texWidth, texHeight, 0, GLES30.GL_RED, GLES30.GL_UNSIGNED_BYTE, null);
GLES30.glTexParameteri(GLES30.GL_TEXTURE_2D, GLES30.GL_TEXTURE_WRAP_S, GLES30.GL_CLAMP_TO_EDGE);
GLES30.glTexParameteri(GLES30.GL_TEXTURE_2D, GLES30.GL_TEXTURE_WRAP_T, GLES30.GL_CLAMP_TO_EDGE);
GLES30.glTexParameteri(GLES30.GL_TEXTURE_2D, GLES30.GL_TEXTURE_MIN_FILTER, GLES30.GL_NEAREST);
GLES30.glTexParameteri(GLES30.GL_TEXTURE_2D, GLES30.GL_TEXTURE_MAG_FILTER, GLES30.GL_NEAREST);
this.tex_id[0] = values[0];
GLES30.glGenFramebuffers(1, values, 0);
GLES30.glBindFramebuffer(GLES30.GL_FRAMEBUFFER, values[0]);
this.fbo_id[0] = values[0];
GLES30.glFramebufferTexture2D(GLES30.GL_FRAMEBUFFER, GLES30.GL_COLOR_ATTACHMENT0, GLES30.GL_TEXTURE_2D, this.tex_id[0], 0);
final int status = GLES30.glCheckFramebufferStatus(GLES30.GL_FRAMEBUFFER);
if (status != GLES30.GL_FRAMEBUFFER_COMPLETE) {
    Debug.LogError("Framebuffer incomplete. Status: " + status);
}
GLES30.glBindFramebuffer(GLES30.GL_FRAMEBUFFER, 0);
The full rendering code. I've broken down the logic and flow as much as possible for clarity.
// bind the offscreen FBO and render the current camera frame
GLES30.glBindFramebuffer(GLES30.GL_FRAMEBUFFER, dualFBO.getID());
camera.draw(ShaderType.GRAYSCALE);
// ping-pong the FBO ID's
dualFBO.swap();
// dualFBO will now return the ID for last frame
GLES30.glBindFramebuffer(GLES30.GL_FRAMEBUFFER, dualFBO.getID());
// bind the current PBO and issue glReadPixels (meant to be async)
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, dualPBO.getID());
GLES30.glReadBuffer(GLES30.GL_COLOR_ATTACHMENT0);
// this call locks for 30-50ms... why? (meant to be async???)
JNI.glReadPixels(0, 0, width, height, GL_RED, GL_UNSIGNED_BYTE, 0);
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
// ping-pong the PBO ID's.
dualPBO.swap();
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, dualPBO.getID());
// this call is instant
JNI.glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, width * height, GL_MAP_READ_BIT);
GLES30.glUnmapBuffer(GLES30.GL_PIXEL_PACK_BUFFER);
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);