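I'm a beginner with OpenGL, and I'm trying to animate a number of "objects" from one position to another every 5 seconds. If I calculate the position in the vertex shader, the FPS drops drastically. Shouldn't this kind of calculation be done on the GPU?
This is the vertex shader code: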
#version 300 es
precision highp float;
precision highp int;
layout(location = 0) in vec3 vertexData;
layout(location = 1) in vec3 colourData;
layout(location = 2) in vec3 normalData;
layout(location = 3) in vec3 personPosition;
layout(location = 4) in vec3 oldPersonPosition;
layout(location = 5) in int start;
layout(location = 6) in int duration;
layout(std140, binding = 0) uniform Matrices
{                    // base // offset
    mat4 projection; // 64 // 0
    mat4 view;       // 64 // 0 + 64 = 64
    int time;        // 4 // 64 + 64 = 128
    bool shade;      // 4 // 128 + 4 = 132 two empty slots after this
    vec3 midPoint;   // 16 // 128 + 16 = 144
    vec3 cameraPos;  // 16 // 144 + 16 = 160
    // size = 160 + 16 = 176. Aligned to 16, stays 176.
};
out vec3 vertexColour;
out vec3 vertexNormal;
out vec3 fragPos;
void main() {
    vec3 scalePos;
    scalePos.x = vertexData.x * 3.0;
    scalePos.y = vertexData.y * 3.0;
    scalePos.z = vertexData.z * 3.0;
    vertexColour = colourData;
    vertexNormal = normalData;
    float startFloat = float(start);
    float durationFloat = float(duration);
    float timeFloat = float(time);
    // Wrap-around catch: start may be close to 1M while time has already wrapped around to 0
    if (startFloat > timeFloat) {
        startFloat = startFloat - 1000000.0;
    }
    vec3 movePos;
    float elapsedTime = timeFloat - startFloat;
    if (elapsedTime > durationFloat) {
        movePos = personPosition;
    } else {
        vec3 moveVector = personPosition - oldPersonPosition;
        float moveBy = elapsedTime / durationFloat;
        movePos = oldPersonPosition + moveVector * moveBy;
    }
    fragPos = movePos;
    gl_Position = projection * view * vec4(scalePos + movePos, 1.0);
}
The buffers are updated every 5 seconds:
glBindBuffer(GL_ARRAY_BUFFER, this->personPositionsVBO);
glBufferData(GL_ARRAY_BUFFER, sizeof(float) * this->persons.size() * 3, this->positions, GL_STATIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, this->personOldPositionsVBO);
glBufferData(GL_ARRAY_BUFFER, sizeof(float) * this->persons.size() * 3, this->oldPositions, GL_STATIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, this->timeStartVBO);
glBufferData(GL_ARRAY_BUFFER, sizeof(int) * this->persons.size(), animiationStart, GL_STATIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, this->timeDurationVBO);
glBufferData(GL_ARRAY_BUFFER, sizeof(int) * this->persons.size(), animiationDuration, GL_STATIC_DRAW);
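As a test, I calculated the positions on the CPU and updated the position buffer on every draw call. That doesn't cause a performance drop, but it feels fundamentally wrong?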
void PersonView::animatePositions() {
    float duration = 1500;
    double currentTime = now_ms();
    double elapsedTime = currentTime - animationStartTime;
    if (elapsedTime > duration) {
        return;
    }
    for (int i = 0; i < this->persons.size() * 3; i++) {
        float moveDistance = this->positions[i] - this->oldPositions[i];
        float moveBy = (float)(elapsedTime / duration);
        this->moveByPositions[i] = this->oldPositions[i] + moveDistance * moveBy;
    }
    glBindBuffer(GL_ARRAY_BUFFER, this->personMoveByPositionsVBO);
    glBufferData(GL_ARRAY_BUFFER, sizeof(float) * this->persons.size() * 3, this->moveByPositions, GL_STATIC_DRAW);
}
On devices with better SoCs (Snapdragon 835 and the like), the frame drop is not as drastic as on devices with mid-range SoCs (Snapdragon 625).
Answer 0 (score: 1)
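Right off the bat, I can see that you are multiplying the projection and view matrices in the vertex shader, yet nothing in the shader depends on the view or projection matrix on its own.
Multiplying two 4x4 matrices adds a lot of arithmetic to every vertex you draw. In your case, it looks like you can avoid it.
Instead of the current implementation, try multiplying the view and projection matrices outside the shader, then bind the result as a single projectionView matrix: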
Old:
gl_Position = projection * view * vec4(scalePos + movePos, 1.0);
New:
gl_Position = projectionView * vec4(scalePos + movePos, 1.0);
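This way the projection and view matrices are multiplied once per frame instead of once per vertex. This change should improve performance noticeably, especially if you draw a large number of vertices.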
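As a rough sketch of the CPU side of this change, the two matrices can be combined once per frame before uploading. The code below assumes column-major storage (the layout OpenGL expects for a `mat4`); the `Mat4` alias and the `multiply` and `makeProjectionView` helpers are hypothetical names for illustration, and a math library such as GLM would normally take their place.

```cpp
#include <array>
#include <cstddef>

// Column-major 4x4 matrix: element (row, col) is stored at index col * 4 + row.
using Mat4 = std::array<float, 16>;

// Returns the matrix product a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 result{};
    for (std::size_t col = 0; col < 4; ++col) {
        for (std::size_t row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < 4; ++k) {
                sum += a[k * 4 + row] * b[col * 4 + k];
            }
            result[col * 4 + row] = sum;
        }
    }
    return result;
}

// Hypothetical per-frame helper: one matrix product on the CPU
// replaces one product per vertex in the shader.
Mat4 makeProjectionView(const Mat4& projection, const Mat4& view) {
    return multiply(projection, view);
}
```

The result would then be uploaded in place of the two separate matrices, for example into the existing uniform block, and read in the shader as `projectionView`.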
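Generally speaking, GPUs are indeed far more efficient than CPUs at this kind of arithmetic, but you should also consider how much of it you are asking for. The vertex shader runs once per vertex, so it should only compute things that differ between vertices.
Doing a calculation once on the CPU is almost always cheaper than doing the same calculation n times on the GPU (n = total number of vertices).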
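The same reasoning could apply to the interpolation factor, if (and only if) all objects share one start time and one duration, as they do in the question's CPU test. Under that assumption, a sketch like the following computes the factor once per frame; `animationProgress` and the `wrapMs` parameter are hypothetical names, with the wrap value taken from the 1,000,000 used in the shader.

```cpp
#include <algorithm>

// Fraction of the animation that has elapsed, clamped to [0, 1].
// Assumes the clock counts milliseconds and wraps back to 0 at wrapMs,
// mirroring the wrap-around check in the question's shader.
float animationProgress(int timeMs, int startMs, int durationMs, int wrapMs = 1000000) {
    if (durationMs <= 0) {
        return 1.0f;  // nothing to animate: treat as finished
    }
    int elapsed = timeMs - startMs;
    if (elapsed < 0) {
        elapsed += wrapMs;  // the clock wrapped after the animation started
    }
    float progress = static_cast<float>(elapsed) / static_cast<float>(durationMs);
    return std::min(1.0f, std::max(0.0f, progress));
}
```

The value could then be passed as a single float uniform, leaving the shader with only `mix(oldPersonPosition, personPosition, moveBy)`. If start and duration really do differ per object, as the attribute layout in the question suggests, this shortcut does not apply as written.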