Question

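I am learning Compute Shaders and trying to use them with Unity (I am not very familiar with shaders). I am trying to do some simple raycasting and write to a render texture from a Compute Shader. Everything works perfectly and I get the result I want. The ray-triangle intersection runs very fast, in less than half a second. However, the moment I try to apply the new color to the render texture, performance breaks down: the time required jumps to 5 seconds. I cannot break out of the loop without making performance worse. I cannot even use a bool flag that is set to true inside the loop and then used outside the loop to update the texture color.

Performance is very bad. How can I update the render texture color?

Here is the shader code. Any help is appreciated.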

//--------------------------------------------------------------------
#pragma kernel MainCS

//--------------------------------------------------------------------
struct Triangle
{
    float3 v0;
    float3 v1;
    float3 v2;
    float3 n;
};

// Precomputed and set from C# script
struct Pixel
{
    float3   position;
    float3   direction;
    int      index;
    float    pixelColor;
};

//-----------------------------------------------------------------------------
#define blocksize 8

// variables
int imageSize;

// buffers
RWStructuredBuffer<Pixel>        pixels    : register(u0); // UAV
RWTexture2D<float4>              rendTex   : register(u1); // UAV
const StructuredBuffer<Triangle> tris      : register(t0); // SRV


// This kernel writes a color to the current pixel if the ray intersects
// any of the triangles in the tris buffer. In general it works well, but it is slow.
// The intersection part without writing to the render texture is SUPER FAST.
// When I attempt to write to the texture, it gets SUPER SLOW.
// Random write on the render texture is enabled from the C# script.

[numthreads(blocksize,blocksize,1)]
void MainCS (uint3 id : SV_DispatchThreadID, uint3 Gid : SV_GroupID, uint3 GTid : SV_GroupThreadID, uint GI : SV_GroupIndex )
{
    // Get the current pixel ID - pixels is 1D array
    int pixelID = (int)(id.y * imageSize + id.x);

    // Ray
    float3 rayO = pixels[pixelID].position;
    float3 rayD = pixels[pixelID].direction;

    // Intersection variables
    float3 pt0, pt1, pt2, edge0, edge1, edge2, cross1, cross2, cross3, n;
    float angle1, angle2, angle3;
    float r, _a, b;
    float3 w0, I;

    bool bIntersect = false;

    [loop][allow_uav_condition]
    for (uint tr = 0; tr < tris.Length; tr++)
    {
        // Some calculations
        pt0 = tris[tr].v0; pt1 = tris[tr].v1; pt2 = tris[tr].v2;
        edge0 = rayO - pt0; edge1 = rayO - pt1; edge2 = rayO - pt2;

        // First check - is the ray intersecting the triangle
        if (dot(rayD, cross(edge0, edge1)) >= 0.0 ||
            dot(rayD, cross(edge1, edge2)) >= 0.0 ||
            dot(rayD, cross(edge2, edge0)) >= 0.0) continue;

        // Finding the intersection point
        n = normalize(cross(pt0 - pt1, pt0 - pt2));
        w0 = rayO - pt0;
        _a = -dot(n, w0);
        b  =  dot(n, rayD);
        r  = _a / b;
        I = rayO + rayD * r;

        // Second check - before validating the hit point
        if (_a < 0.0)
        {
            // Here I want to update the texture color

            // ==============================================
            // Variant 1 =======================================
            // Only update the texture, without break;
            // Gives the proper result but is SLOW - 3 seconds
            rendTex[id.xy] = float4(1.0, 0.0, 0.0, 1.0);
            // If break; is added - MUCH SLOWER
            break;

            // ===============================================
            // Variant 2 - Part 1 ==================================
            // Raising the flag to true - fast
            if(!bIntersect)
            {
                bIntersect = true;
            }
        }
    }

    // Variant 2 - Part 2 - When using the flag, updating the render texture color is SUPER SLOW but accurate
    if(bIntersect)
        rendTex[id.xy] = float4(1.0, 0.0, 0.0, 1.0);
}

Answer 1

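I am assuming you are trying to make something like a painting tool that lets you draw on a surface. I have built one of these before, but I did it by drawing to the texture directly from Unity rather than from a shader. Also, you do not need it to be a render texture unless you are trying to capture the render of another camera and then composite on top of it.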

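Shaders are generally very fast because they can parallelize drawing many pixels to the draw buffer at the same time. Writing to texture memory, however, is much slower. Odds are your performance problem comes from the shader constantly updating the texture for every pixel on every frame: lots of very small write operations. Imagine writing a novel by opening a text file, changing a single character, closing it, and repeating.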

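I suggest drawing to the texture directly in Unity with Texture2D.SetPixels(). It lets you write to texture memory in bulk by accepting an array of Unity Color objects, and the modified pixels are only sent when you call Apply() on the texture.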

Also, if you need the UV coordinate in texture space, there is RaycastHit.textureCoord.

Here is the example from the Unity documentation that draws to a texture based on where a raycast hits the surface of an object.

using UnityEngine;
using System.Collections;

public class ExampleClass : MonoBehaviour {
    public Camera cam;
    void Start() {
        cam = GetComponent<Camera>();
    }
    void Update() {
        if (!Input.GetMouseButton(0))
            return;

        RaycastHit hit;
        if (!Physics.Raycast(cam.ScreenPointToRay(Input.mousePosition), out hit))
            return;

        Renderer rend = hit.transform.GetComponent<Renderer>();
        MeshCollider meshCollider = hit.collider as MeshCollider;
        if (rend == null || rend.sharedMaterial == null || rend.sharedMaterial.mainTexture == null || meshCollider == null)
            return;

        Texture2D tex = rend.material.mainTexture as Texture2D;
        Vector2 pixelUV = hit.textureCoord;
        pixelUV.x *= tex.width;
        pixelUV.y *= tex.height;
        tex.SetPixel((int)pixelUV.x, (int)pixelUV.y, Color.black);
        tex.Apply();
    }
}

Answer 2

Dynamic branching is very expensive when programming on the GPU.

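This is because of how GPUs are designed. A simplified view of how a CPU works: it fetches an instruction, decodes it, and then executes it on an ALU. A GPU fetches an instruction, decodes it, and then executes it on a whole bunch of ALUs at the same time. It steps through each line on every thread in lockstep, and it has to run through the code again for all of those pixels even if only one of the threads needs to execute a different instruction.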

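Basically, avoid dynamic branching (if statements) wherever you can. When you run a for loop with a conditional break, you create a lot of branching, which is the Achilles' heel of the GPU. The flag is faster because the GPU executes all of those instructions on every thread anyway. Try to have as many threads as possible execute the same lines of code as much as possible.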
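As a rough illustration of that advice, here is a minimal sketch of the question's kernel rewritten so that every thread runs the same instructions: no continue, no break, and a single unconditional texture write at the end. The kernel name FlagCS, the use of GetDimensions for the triangle count, and the black color written for pixels with no hit are assumptions made for this sketch and do not come from the original code. Whether it is actually faster will depend on the hardware and the shader compiler, so treat it as something to profile rather than a guaranteed fix.

```hlsl
#pragma kernel FlagCS   // hypothetical kernel name, not from the question

struct Triangle { float3 v0; float3 v1; float3 v2; float3 n; };
struct Pixel    { float3 position; float3 direction; int index; float pixelColor; };

int imageSize;
RWStructuredBuffer<Pixel>  pixels;
RWTexture2D<float4>        rendTex;
StructuredBuffer<Triangle> tris;

[numthreads(8, 8, 1)]
void FlagCS (uint3 id : SV_DispatchThreadID)
{
    uint pixelID = id.y * (uint)imageSize + id.x;
    float3 rayO = pixels[pixelID].position;
    float3 rayD = pixels[pixelID].direction;

    // Query the number of triangles in the buffer.
    uint triCount, stride;
    tris.GetDimensions(triCount, stride);

    bool hit = false;
    for (uint tr = 0; tr < triCount; tr++)
    {
        float3 pt0 = tris[tr].v0;
        float3 pt1 = tris[tr].v1;
        float3 pt2 = tris[tr].v2;
        float3 edge0 = rayO - pt0;
        float3 edge1 = rayO - pt1;
        float3 edge2 = rayO - pt2;

        // Same three edge tests as the question, expressed as a value
        // instead of an early "continue".
        bool inside = dot(rayD, cross(edge0, edge1)) < 0.0 &&
                      dot(rayD, cross(edge1, edge2)) < 0.0 &&
                      dot(rayD, cross(edge2, edge0)) < 0.0;

        // Same second check as the question (_a < 0.0).
        float3 n = normalize(cross(pt0 - pt1, pt0 - pt2));
        bool facing = -dot(n, edge0) < 0.0;

        // Accumulate the result; every thread visits every triangle.
        hit = hit || (inside && facing);
    }

    // One write per thread, issued by all threads alike.
    // Assumption: pixels without a hit are cleared to opaque black.
    rendTex[id.xy] = hit ? float4(1.0, 0.0, 0.0, 1.0) : float4(0.0, 0.0, 0.0, 1.0);
}
```

The trade-off is that threads which find a hit early still loop over the remaining triangles, so this gives up early exit in exchange for uniform control flow.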

Writing to a render texture from Compute Shaders in Unity is slow?
