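I have started learning Direct3D 12 and am having trouble understanding CPU-GPU synchronization. As far as I understand, a fence (ID3D12Fence) is just a UINT64 (unsigned long long) value used as a counter, but its methods confuse me. Below is part of the source code from a D3D12 sample (https://github.com/d3dcoder/d3d12book).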
    void D3DApp::FlushCommandQueue()
    {
        // Advance the fence value to mark commands up to this fence point.
        mCurrentFence++;

        // Add an instruction to the command queue to set a new fence point. Because we
        // are on the GPU timeline, the new fence point won't be set until the GPU finishes
        // processing all the commands prior to this Signal().
        ThrowIfFailed(mCommandQueue->Signal(mFence.Get(), mCurrentFence));

        // Wait until the GPU has completed commands up to this fence point.
        if (mFence->GetCompletedValue() < mCurrentFence)
        {
            HANDLE eventHandle = CreateEventEx(nullptr, false, false, EVENT_ALL_ACCESS);

            // Fire event when GPU hits current fence.
            ThrowIfFailed(mFence->SetEventOnCompletion(mCurrentFence, eventHandle));

            // Wait until the GPU hits the current fence and the event is fired.
            WaitForSingleObject(eventHandle, INFINITE);
            CloseHandle(eventHandle);
        }
    }
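As I understand it, this part tries to "flush" the command queue, which basically makes the CPU wait until the GPU reaches the given fence value, so that the CPU and the GPU end up with the same fence value.
Q. If Signal() is the function that lets the GPU update the fence value inside the given ID3D12Fence, why is the mCurrentFence value needed?
According to the Microsoft docs, it "updates a fence to a specified value". What specified value? What I need is "get the value of the last completed command list", not set or specify one. What is this specified value?
To me, it seems it should look like this: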
    // Suppose mCurrentFence is 1 after submitting 1 command list (index 0),
    // and the thread reaches this point for the FIRST time.
    ThrowIfFailed(mCommandQueue->Signal(mFence.Get()));
    // At this point the fence value inside mFence is updated.
    if (mFence->GetCompletedValue() < mCurrentFence)
    {
        ...
    }
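If mFence->GetCompletedValue() is 0, the check is
    if (0 < 1)
The GPU has not yet processed the command list (index 0), so the CPU must wait until the GPU catches up. Calling SetEventOnCompletion, WaitForSingleObject, and so on then makes sense.
If it is 1, the check is
    if (1 < 1)
The GPU has completed the command list (index 0), so the CPU does not need to wait.
mCurrentFence would then be incremented wherever the command lists are executed:
    mCommandQueue->ExecuteCommandLists(_countof(cmdsLists), cmdsLists);
    mCurrentFence++;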
Answer 0 (score: 1)
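mCommandQueue->Signal(mFence.Get(), mCurrentFence) sets the fence value to mCurrentFence. In this case, the "specified value" is mCurrentFence.
At the start, both the fence value and mCurrentFence are 0. Next, mCurrentFence is set to 1. Then mCommandQueue->Signal(mFence.Get(), 1) is called, which sets the fence to 1 once everything submitted before it on that queue has executed. Finally, we call mFence->SetEventOnCompletion(1, eventHandle) and then WaitForSingleObject to wait until the fence is set to 1.
Replace 1 with 2 for the next iteration, and so on.
Note that mCommandQueue->Signal is a non-blocking operation. It does not set the fence value immediately, but only after all the GPU commands queued before it have executed. In this example you can generally expect mFence->GetCompletedValue() < mCurrentFence to be true, because the CPU reaches the check before the GPU has processed the signal.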
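To make the "Signal only enqueues the update" point concrete, here is a minimal CPU-only sketch. MockFence, MockQueue, Step and SimulateFlush are hypothetical names invented for this illustration and are not part of the D3D12 API. The "GPU" is modelled as a single-threaded queue that retires one command per Step() call, which is a simplification of how a real command queue behaves.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for ID3D12Fence: just a 64-bit counter.
struct MockFence {
    std::uint64_t completed = 0;
};

// Hypothetical stand-in for ID3D12CommandQueue. Commands run strictly in
// submission order, one per Step() call, to imitate the GPU timeline.
class MockQueue {
public:
    void Execute(std::function<void()> work) { pending_.push_back(std::move(work)); }

    // Like Signal(): returns at once and only *enqueues* "set the fence to
    // value" behind everything that was submitted earlier.
    void Signal(MockFence& fence, std::uint64_t value) {
        pending_.push_back([&fence, value] { fence.completed = value; });
    }

    // The simulated GPU retires one command. Returns false if nothing is left.
    bool Step() {
        if (pending_.empty()) return false;
        auto cmd = std::move(pending_.front());
        pending_.pop_front();
        cmd();
        return true;
    }

private:
    std::deque<std::function<void()>> pending_;
};

struct FlushTrace {
    std::uint64_t rightAfterSignal;  // fence value seen immediately after Signal()
    std::uint64_t afterWait;         // fence value once the wait is over
    int workRetired;                 // work items that ran before the wait ended
};

// Mirrors the shape of FlushCommandQueue: bump the CPU-side counter,
// enqueue a signal for that value, then wait until the fence reaches it.
FlushTrace SimulateFlush() {
    MockFence fence;
    MockQueue queue;
    std::uint64_t currentFence = 0;  // plays the role of mCurrentFence
    int work = 0;

    queue.Execute([&work] { ++work; });
    queue.Execute([&work] { ++work; });

    ++currentFence;
    queue.Signal(fence, currentFence);

    FlushTrace trace{};
    trace.rightAfterSignal = fence.completed;
    assert(trace.rightAfterSignal == 0);  // Signal() did not touch the fence yet

    // Stand-in for SetEventOnCompletion + WaitForSingleObject.
    while (fence.completed < currentFence && queue.Step()) {}

    trace.afterWait = fence.completed;
    trace.workRetired = work;
    return trace;
}
```

Calling SimulateFlush() from a main function shows the fence still at its old value right after Signal(), and at the specified value only after both earlier work items have been retired. That is why the caller has to say which value it wants: the value is a label for "everything submitted up to here".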
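"Why is the mCurrentFence value needed?"
I suppose it is not strictly necessary, but tracking the fence value this way avoids an extra API call. In this case, you could also do the following: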
    // Retrieve the last value of the fence and increment it by one (additional API call).
    auto nextFence = mFence->GetCompletedValue() + 1;
    ThrowIfFailed(mCommandQueue->Signal(mFence.Get(), nextFence));

    // Wait until the GPU has completed commands up to this fence point.
    if (mFence->GetCompletedValue() < nextFence)
    {
        HANDLE eventHandle = CreateEventEx(nullptr, false, false, EVENT_ALL_ACCESS);
        ThrowIfFailed(mFence->SetEventOnCompletion(nextFence, eventHandle));
        WaitForSingleObject(eventHandle, INFINITE);
        CloseHandle(eventHandle);
    }
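One caveat worth spelling out: deriving the next value from GetCompletedValue() is safe in the snippet above only because every Signal is immediately followed by a full wait, so no earlier signal is still pending. The sketch below uses a hypothetical single-threaded model (MockFence, MockQueue and WaitReturnsEarly are made-up names, not D3D12 API) to show what can happen if two signals are queued without a wait in between.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-ins for ID3D12Fence and ID3D12CommandQueue.
struct MockFence {
    std::uint64_t completed = 0;
};

class MockQueue {
public:
    void Execute(std::function<void()> work) { pending_.push_back(std::move(work)); }

    // Non-blocking: only enqueues the fence update behind earlier work.
    void Signal(MockFence& fence, std::uint64_t value) {
        pending_.push_back([&fence, value] { fence.completed = value; });
    }

    // The simulated GPU retires one command. Returns false if nothing is left.
    bool Step() {
        if (pending_.empty()) return false;
        auto cmd = std::move(pending_.front());
        pending_.pop_front();
        cmd();
        return true;
    }

private:
    std::deque<std::function<void()>> pending_;
};

// Returns true if the final wait ended while batch 2 was still unfinished.
// trackOnCpu == true  : values come from a CPU-side counter (like mCurrentFence).
// trackOnCpu == false : values are derived as "completed value + 1".
bool WaitReturnsEarly(bool trackOnCpu) {
    MockFence fence;
    MockQueue queue;
    std::uint64_t cpuCounter = 0;
    bool batch2Done = false;

    auto nextValue = [&]() -> std::uint64_t {
        return trackOnCpu ? ++cpuCounter : fence.completed + 1;
    };

    // Batch 1 and its signal, with no wait afterwards.
    queue.Execute([] {});
    queue.Signal(fence, nextValue());  // value 1 in both modes

    // The simulated GPU has not run anything yet.
    assert(fence.completed == 0);

    // Batch 2 and its signal.
    queue.Execute([&batch2Done] { batch2Done = true; });
    const std::uint64_t target = nextValue();  // 2 if tracked, 1 again if derived
    queue.Signal(fence, target);

    // Stand-in for SetEventOnCompletion + WaitForSingleObject on `target`.
    while (fence.completed < target && queue.Step()) {}

    return !batch2Done;
}
```

In this model WaitReturnsEarly(false) reports that the wait finished before batch 2 ran, because both signals carried the same value, while WaitReturnsEarly(true) does not. A CPU-side counter therefore does more than save an API call as soon as more than one fence point is in flight.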
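Answer 1 (score: 0)
As a supplement to Felix's answer:
Tracking the fence value (for example mCurrentFence) is useful for waiting on more specific points in the command queue.
For example, suppose we are using this setup: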
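    ComPtr<ID3D12CommandQueue> queue;
    ComPtr<ID3D12Fence> queueFence;
    UINT64 fenceVal = 0;
    // Auto-reset event used by waitFor(); assumed to be created once at startup.
    HANDLE fenceEv = CreateEvent(nullptr, FALSE, FALSE, nullptr);

    // Enqueue a new fence point and return the value that marks it.
    UINT64 incrementFence()
    {
        fenceVal++;
        queue->Signal(queueFence.Get(), fenceVal); // CHECK HRESULT
        return fenceVal;
    }

    // Block until the GPU has reached the given fence point (or the timeout expires).
    void waitFor(UINT64 fenceVal, DWORD timeout = INFINITE)
    {
        if (queueFence->GetCompletedValue() < fenceVal)
        {
            queueFence->SetEventOnCompletion(fenceVal, fenceEv); // CHECK HRESULT
            WaitForSingleObject(fenceEv, timeout);
        }
    }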
Then we can do the following (pseudocode):
    SUBMIT COMMANDS 1
    cmds1Complete = incrementFence();
    .
    . <- CPU STUFF
    .
    SUBMIT COMMANDS 2
    cmds2Complete = incrementFence();
    .
    . <- CPU STUFF
    .
    waitFor(cmds1Complete)
    .
    . <- CPU STUFF (that needs COMMANDS 1 to be complete;
    .    COMMANDS 2 is NOT required to be complete, but could be)
    .
    waitFor(cmds2Complete)
    .
    . <- EVERYTHING COMPLETE
    .
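Because we track fenceVal, we can also have a flush function that simply waits for the tracked fenceVal (rather than for a value returned earlier from incrementFence). That is essentially what you have in FlushCommandQueue: since it inlines the signal, the value it waits for is always the latest one (which is why, as Felix said, it just saves an API call):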
    void flushCmdQueue()
    {
        waitFor(incrementFence());
    }
This example is a bit more involved than the original question, but I think it matters when asking why mCurrentFence is tracked.
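For anyone who wants to experiment with the pattern without a GPU, here is a rough single-threaded sketch of the same idea. TrackedQueue, submit, completedValue and simulateTwoBatches are hypothetical names made up for this illustration; only incrementFence and waitFor mirror the helpers above. The simulated GPU advances only while the CPU is waiting, which is a simplification, since a real GPU runs concurrently.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical model of a command queue with a tracked fence value.
class TrackedQueue {
public:
    void submit(std::function<void()> work) { pending_.push_back(std::move(work)); }

    // Enqueue a fence point behind the work submitted so far and return the
    // value that marks it (same role as incrementFence() above).
    std::uint64_t incrementFence() {
        const std::uint64_t value = ++fenceVal_;
        pending_.push_back([this, value] { completed_ = value; });
        return value;
    }

    // Let the simulated GPU run only until the requested fence point is reached.
    void waitFor(std::uint64_t value) {
        while (completed_ < value && !pending_.empty()) {
            auto cmd = std::move(pending_.front());
            pending_.pop_front();
            cmd();
        }
    }

    std::uint64_t completedValue() const { return completed_; }

private:
    std::deque<std::function<void()>> pending_;
    std::uint64_t fenceVal_ = 0;   // CPU-side counter
    std::uint64_t completed_ = 0;  // what GetCompletedValue() would report
};

struct Progress {
    bool batch1AfterFirstWait;   // batch 1 finished after waitFor(cmds1Complete)?
    bool batch2AfterFirstWait;   // batch 2 finished at that same moment?
    bool batch2AfterSecondWait;  // batch 2 finished after waitFor(cmds2Complete)?
};

Progress simulateTwoBatches() {
    TrackedQueue q;
    bool batch1 = false;
    bool batch2 = false;

    q.submit([&batch1] { batch1 = true; });
    const std::uint64_t cmds1Complete = q.incrementFence();

    q.submit([&batch2] { batch2 = true; });
    const std::uint64_t cmds2Complete = q.incrementFence();
    assert(cmds2Complete == cmds1Complete + 1);  // each fence point gets its own value

    Progress p{};
    q.waitFor(cmds1Complete);
    p.batch1AfterFirstWait = batch1;
    p.batch2AfterFirstWait = batch2;

    q.waitFor(cmds2Complete);
    p.batch2AfterSecondWait = batch2;
    return p;
}
```

In this deterministic model, batch 2 is still pending after the first wait. On real hardware it may or may not have finished by then, which is exactly the "not required to be complete, but could be" case in the pseudocode.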