How do I synchronize the CPU and GPU with a fence in DirectX / Direct3D 12?

Asked: 2019-10-24 10:52:45

Tags: synchronization gpu directx direct3d12

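I've started learning Direct3D 12 and am having trouble understanding CPU-GPU synchronization. As far as I understand, a fence (ID3D12Fence) is nothing more than a UINT64 (unsigned long long) value used as a counter. But its methods confuse me. Below is part of the source code from a D3D12 sample (https://github.com/d3dcoder/d3d12book).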

void D3DApp::FlushCommandQueue()
{
    // Advance the fence value to mark commands up to this fence point.
    mCurrentFence++;

    // Add an instruction to the command queue to set a new fence point.  Because we 
    // are on the GPU timeline, the new fence point won't be set until the GPU finishes
    // processing all the commands prior to this Signal().
    ThrowIfFailed(mCommandQueue->Signal(mFence.Get(), mCurrentFence));

    // Wait until the GPU has completed commands up to this fence point.
    if(mFence->GetCompletedValue() < mCurrentFence)
    {
        // Second argument is the event name (none), third is the flags (0 = auto-reset, non-signaled).
        HANDLE eventHandle = CreateEventEx(nullptr, nullptr, 0, EVENT_ALL_ACCESS);

        // Fire event when GPU hits current fence.  
        ThrowIfFailed(mFence->SetEventOnCompletion(mCurrentFence, eventHandle));

        // Wait until the GPU hits current fence event is fired.
        WaitForSingleObject(eventHandle, INFINITE);
        CloseHandle(eventHandle);
    }
}

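As I understand it, this part tries to "flush" the command queue, which basically makes the CPU wait for the GPU until it reaches the given fence value, so that the CPU and GPU end up with the same fence value.

Q. If Signal() is the function that lets the GPU update the fence value inside the given ID3D12Fence, why is the mCurrentFence value needed?

According to the Microsoft docs, it "updates a fence to a specified value." What specified value? What I need is "get the value of the last completed command list", not set or specify one. What is this specified value for?

To me, it seems like it should be: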

// Suppose mCurrentFence is 1 after submitting 1 command list (Index 0), and the thread reached to here for the FIRST time
ThrowIfFailed(mCommandQueue->Signal(mFence.Get()));
// At this point Fence value inside mFence is updated
if (mFence->GetCompletedValue() < mCurrentFence)
{
...
}

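If mFence->GetCompletedValue() returns 0, the check becomes

if (0 < 1)

The GPU has not yet processed the command list (index 0), so the CPU must wait until the GPU catches up. Calling SetEventOnCompletion, WaitForSingleObject, and so on then makes sense.

If it returns 1, the check becomes

if (1 < 1)

The GPU has finished the command list (index 0), so the CPU does not need to wait.

mCurrentFence would be incremented somewhere near where the command lists are executed: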

mCommandQueue->ExecuteCommandLists(_countof(cmdsLists), cmdsLists);
mCurrentFence++;

2 answers:

Answer 0 (score: 1)

mCommandQueue->Signal(mFence.Get(), mCurrentFence) sets the fence value to mCurrentFence once all previously queued commands on the command queue have been executed. In this case, the "specified value" is mCurrentFence.

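At the start, both the fence and mCurrentFence are 0. Next, mCurrentFence is set to 1. Then we call mCommandQueue->Signal(mFence.Get(), 1), which sets the fence to 1 as soon as everything previously queued has executed on that queue. Finally, we call mFence->SetEventOnCompletion(1, eventHandle) followed by WaitForSingleObject to wait until the fence is set to 1.

Replace 1 with 2 for the next iteration, and so on.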

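Note that mCommandQueue->Signal is a non-blocking operation. It does not set the fence value immediately; the value is set only after all the other GPU commands queued before it have been executed. In this example, mFence->GetCompletedValue() < mCurrentFence will therefore usually be true right after the call, unless the GPU had already drained the queue.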
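To make the non-blocking behavior concrete, here is a minimal single-threaded sketch. It does not use the real D3D12 API: MockFence, MockQueue, GpuStep and signalThenDrain are hypothetical names invented for illustration, and the "GPU" only advances when GpuStep() is called. It only models the ordering rule described above, namely that the fence value changes when the queued signal is reached, not when Signal() is called.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical model of a fence: just the counter the GPU writes to.
struct MockFence {
    std::uint64_t completed = 0;
    std::uint64_t GetCompletedValue() const { return completed; }
};

// Hypothetical model of a command queue that retires items strictly in order.
struct MockQueue {
    // fence == nullptr means "some GPU work"; otherwise "set fence to value".
    struct Item { MockFence* fence; std::uint64_t value; };
    std::deque<Item> pending;

    void ExecuteWork() { pending.push_back({nullptr, 0}); }

    // Like Signal(): only enqueues the fence update and returns at once.
    void Signal(MockFence& f, std::uint64_t v) { pending.push_back({&f, v}); }

    // Simulates the GPU retiring the next queued item.
    void GpuStep() {
        if (pending.empty()) return;
        Item it = pending.front();
        pending.pop_front();
        if (it.fence) it.fence->completed = it.value;
    }
};

struct Observed { std::uint64_t afterSignal; std::uint64_t afterDrain; };

Observed signalThenDrain() {
    MockFence fence;
    MockQueue queue;
    std::uint64_t currentFence = 0;   // CPU-side counter, like mCurrentFence

    queue.ExecuteWork();              // one command list submitted
    ++currentFence;                   // CPU chooses the value marking this point
    queue.Signal(fence, currentFence);

    Observed o{};
    o.afterSignal = fence.GetCompletedValue(); // GPU has not reached the signal yet
    queue.GpuStep();                           // the work item retires
    queue.GpuStep();                           // the queued signal executes
    o.afterDrain = fence.GetCompletedValue();
    return o;
}
```

In this model the CPU is the side that picks the number; the fence merely reports the last number the GPU got to, which is why Signal() needs a value as an argument.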

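Why is the mCurrentFence value needed?

I suppose it isn't strictly necessary, but tracking the fence value this way saves you an extra API call. In this case, you could also do: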

// retrieve last value of the fence and increment by one (Additional API call)
auto nextFence = mFence->GetCompletedValue() + 1;
ThrowIfFailed(mCommandQueue->Signal(mFence.Get(), nextFence));

// Wait until the GPU has completed commands up to this fence point.
if(mFence->GetCompletedValue() < nextFence)
{
    // Second argument is the event name (none), third is the flags (0 = auto-reset, non-signaled).
    HANDLE eventHandle = CreateEventEx(nullptr, nullptr, 0, EVENT_ALL_ACCESS);
    ThrowIfFailed(mFence->SetEventOnCompletion(nextFence, eventHandle));
    WaitForSingleObject(eventHandle, INFINITE);
    CloseHandle(eventHandle);
}
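One caveat worth spelling out: deriving the next value from GetCompletedValue() is only safe when every earlier signal has already been waited on, as it is in a flush that blocks each time. The sketch below is a hypothetical model (FenceModel and the helper functions are invented for illustration, not part of D3D12) that shows what could happen if two signals were issued back to back without a wait in between.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: `completed` stands in for ID3D12Fence::GetCompletedValue(),
// and `queued` records values passed to Signal() that the GPU has not reached yet.
struct FenceModel {
    std::uint64_t completed = 0;
    std::vector<std::uint64_t> queued;
    void signal(std::uint64_t v) { queued.push_back(v); }
};

// Strategy from the variant above: derive the next value from the fence itself.
std::uint64_t nextFromCompleted(const FenceModel& f) { return f.completed + 1; }

// Two signals in a row, no wait in between, values derived from the fence.
std::vector<std::uint64_t> signalTwiceFromCompleted() {
    FenceModel f;
    f.signal(nextFromCompleted(f));   // completed is 0, so this signals 1
    f.signal(nextFromCompleted(f));   // completed is still 0, so this signals 1 again
    return f.queued;
}

// Same sequence, but with a CPU-side counter like mCurrentFence.
std::vector<std::uint64_t> signalTwiceFromCounter() {
    FenceModel f;
    std::uint64_t currentFence = 0;
    f.signal(++currentFence);         // signals 1
    f.signal(++currentFence);         // signals 2
    return f.queued;
}
```

With the CPU-side counter each submission gets its own distinct value, so a later wait can tell the two points apart.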

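Answer 1 (score: 0)

As an addition to Felix's answer:

Tracking the fence value (e.g. mCurrentFence) is useful for waiting on more specific points in the command queue.

For example, say we are using this setup: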

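ComPtr<ID3D12CommandQueue> queue;
ComPtr<ID3D12Fence> queueFence;
UINT64 fenceVal = 0;
// fenceEv was not declared in the original snippet; assumed here to be an
// auto-reset event created once and reused for every wait.
HANDLE fenceEv = CreateEvent(nullptr, FALSE, FALSE, nullptr);

// Signals the next fence value on the queue and returns it.
UINT64 incrementFence()
{
    fenceVal++;
    queue->Signal(queueFence.Get(), fenceVal); // CHECK HRESULT
    return fenceVal;
}

// Blocks until the fence reaches `value` (or the timeout elapses).
void waitFor(UINT64 value, DWORD timeout = INFINITE)
{
    if (queueFence->GetCompletedValue() < value)
    {
        queueFence->SetEventOnCompletion(value, fenceEv); // CHECK HRESULT
        WaitForSingleObject(fenceEv, timeout);
    }
}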

Then we can do the following (pseudocode):

SUBMIT COMMANDS 1
cmds1Complete = incrementFence();
    .
    . <- CPU STUFF
    .
SUBMIT COMMANDS 2
cmds2Complete = incrementFence();
    .
    . <- CPU STUFF
    .
waitFor(cmds1Complete)
    .
    . <- CPU STUFF (that needs COMMANDS 1 to be complete,
      but COMMANDS 2 is NOT required to be completed [but also could be])
    .
waitFor(cmds2Complete)
    .
    . <- EVERYTHING COMPLETE
    .
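The timeline above can be tried out without a GPU using a small simulation. This is only a sketch under stated assumptions: MockGpu, submitAndSignal, Snapshot and waitForFirstBatchOnly are hypothetical names, the "GPU" advances only while waitFor() pumps it, and each batch simply takes a fixed number of steps. It illustrates that waiting on the first tracked value can return while the second batch is still in flight.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for a queue + fence pair. Not the D3D12 API.
class MockGpu {
public:
    // Submits a batch costing `cost` steps, enqueues a signal after it,
    // and returns the value that signal will write (like incrementFence()).
    std::uint64_t submitAndSignal(int cost) {
        ++fenceVal_;
        pending_.push_back({cost, fenceVal_});
        return fenceVal_;
    }

    // Pumps the simulated GPU until the fence reaches `value`.
    void waitFor(std::uint64_t value) {
        while (completed_ < value && !pending_.empty()) {
            Batch& b = pending_.front();
            if (--b.remaining <= 0) {
                completed_ = b.signalValue;  // the queued signal executes
                pending_.pop_front();
            }
        }
    }

    std::uint64_t completedValue() const { return completed_; }
    std::size_t batchesInFlight() const { return pending_.size(); }

private:
    struct Batch { int remaining; std::uint64_t signalValue; };
    std::deque<Batch> pending_;
    std::uint64_t fenceVal_ = 0;   // last value handed out by the CPU
    std::uint64_t completed_ = 0;  // last value reached by the "GPU"
};

struct Snapshot { std::uint64_t completed; std::size_t inFlight; };

Snapshot waitForFirstBatchOnly() {
    MockGpu gpu;
    std::uint64_t cmds1Complete = gpu.submitAndSignal(3);  // COMMANDS 1
    std::uint64_t cmds2Complete = gpu.submitAndSignal(5);  // COMMANDS 2
    (void)cmds2Complete;            // deliberately not waited on here
    gpu.waitFor(cmds1Complete);     // returns once COMMANDS 1 is done
    return {gpu.completedValue(), gpu.batchesInFlight()};
}
```

On real hardware the second batch may or may not have finished by the time the first wait returns; the model just makes the "not required to be complete" case observable.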

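Because we track fenceVal, we can also write a flush function that simply waits on the tracked fenceVal (rather than on a value returned earlier from incrementFence). That is essentially what your FlushCommandQueue does: since it signals inline, the value it waits on is always the latest one (which is why, as Felix said, tracking it just saves an API call):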

void flushCmdQueue()
{
    waitFor(incrementFence());
}

This example is a bit more involved than the original question, but I think it matters when asking why mCurrentFence is tracked.