I have a highly imbalanced 3D dataset in which about 80% of the volume is background and I am only interested in the foreground elements, which occupy about 20% of the total volume at random locations. These locations are recorded in the label tensor given to the network. The target tensor is binary: 0 denotes background, and 1 denotes the regions we are interested in or want to segment.
Each volume has the size [30, 512, 1024], and I iterate over each volume with patches of size [30, 64, 64]. As a result, most of my patches contain only 0 values in the target tensor.
I have read that DiceLoss is well suited to this kind of problem and has been used successfully for 3D MRI scan segmentation. A simple implementation can be found here: https://github.com/pytorch/pytorch/issues/1249#issuecomment-305088398
def dice_loss(input, target):
    smooth = 1.
    iflat = input.view(-1)
    tflat = target.view(-1)
    intersection = (iflat * tflat).sum()
    return 1 - ((2. * intersection + smooth) /
                (iflat.sum() + tflat.sum() + smooth))
This doesn't work for me. What I mean is: for a patch that is all background, tflat.sum() will be 0. That also makes intersection 0, so for most of my patches or blocks the function will return 1.
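The arithmetic can be checked directly. A small sketch (using the same dice_loss as above, with hypothetical patch values) shows that on an all-background patch, any sizeable positive predictions push the loss toward 1:

```python
import torch

def dice_loss(input, target):
    smooth = 1.
    iflat = input.view(-1)
    tflat = target.view(-1)
    intersection = (iflat * tflat).sum()
    return 1 - ((2. * intersection + smooth) /
                (iflat.sum() + tflat.sum() + smooth))

# All-background patch: the target is all zeros, so intersection is 0.
pred = torch.full((30 * 64 * 64,), 0.5)   # sigmoid-like outputs around 0.5
target = torch.zeros(30 * 64 * 64)

loss = dice_loss(pred, target)
# loss = 1 - 1 / (0.5 * 30*64*64 + 1) ≈ 0.99998, i.e. effectively 1
print(loss.item())
```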
Is that right? This is not how it is supposed to work. I'm struggling with this, because here is my network's output:
idx: 0 of 312 - Training Loss: 1.0 - Training Accuracy: 3.204042239857152e-11
idx: 5 of 312 - Training Loss: 0.9876335859298706 - Training Accuracy: 0.0119545953348279
idx: 10 of 312 - Training Loss: 1.0 - Training Accuracy: 7.269467666715101e-11
idx: 15 of 312 - Training Loss: 0.7320756912231445 - Training Accuracy: 0.22638492286205292
idx: 20 of 312 - Training Loss: 0.3599294424057007 - Training Accuracy: 0.49074622988700867
idx: 25 of 312 - Training Loss: 1.0 - Training Accuracy: 1.0720428988975073e-09
idx: 30 of 312 - Training Loss: 1.0 - Training Accuracy: 1.19782361807097e-09
idx: 35 of 312 - Training Loss: 1.0 - Training Accuracy: 1.956790285362331e-09
idx: 40 of 312 - Training Loss: 1.0 - Training Accuracy: 1.6055999862985004e-09
idx: 45 of 312 - Training Loss: 1.0 - Training Accuracy: 7.580232552761856e-10
idx: 50 of 312 - Training Loss: 1.0 - Training Accuracy: 9.510597864803572e-10
idx: 55 of 312 - Training Loss: 1.0 - Training Accuracy: 1.341515676323013e-09
idx: 60 of 312 - Training Loss: 0.7165247797966003 - Training Accuracy: 0.02658153884112835
idx: 65 of 312 - Training Loss: 1.0 - Training Accuracy: 4.528208030762926e-09
idx: 70 of 312 - Training Loss: 0.3205708861351013 - Training Accuracy: 0.6673439145088196
idx: 75 of 312 - Training Loss: 0.9305377006530762 - Training Accuracy: 2.3437689378624782e-05
idx: 80 of 312 - Training Loss: 1.0 - Training Accuracy: 5.305786885401176e-07
idx: 85 of 312 - Training Loss: 1.0 - Training Accuracy: 4.0612556517771736e-07
idx: 90 of 312 - Training Loss: 0.8207412362098694 - Training Accuracy: 0.0344742126762867
idx: 95 of 312 - Training Loss: 0.7463213205337524 - Training Accuracy: 0.19459737837314606
idx: 100 of 312 - Training Loss: 1.0 - Training Accuracy: 4.863646818620282e-09
idx: 105 of 312 - Training Loss: 0.35790306329727173 - Training Accuracy: 0.608722984790802
idx: 110 of 312 - Training Loss: 1.0 - Training Accuracy: 3.3852198821904267e-09
idx: 115 of 312 - Training Loss: 1.0 - Training Accuracy: 1.5268487585373691e-09
idx: 120 of 312 - Training Loss: 1.0 - Training Accuracy: 3.46353523639209e-09
idx: 125 of 312 - Training Loss: 1.0 - Training Accuracy: 2.5878148582347826e-11
idx: 130 of 312 - Training Loss: 1.0 - Training Accuracy: 2.3601216467272756e-11
idx: 135 of 312 - Training Loss: 1.0 - Training Accuracy: 1.1504343033763575e-09
idx: 140 of 312 - Training Loss: 0.4516671299934387 - Training Accuracy: 0.13879922032356262
I don't think the network can learn anything from this.
Now I'm confused, because my problem shouldn't be this hard: I'm sure MRI scans also have target tensors in which most voxels represent background. What am I doing wrong?
Thanks
Answer 0 (score: 1)
If your algorithm predicted a value of exactly 0 for every background voxel, the ratio would be 1 and the loss 0; but if it predicts any positive values (which it certainly will if you use a sigmoid activation), it can still improve the loss by outputting as little as possible. In other words, the numerator cannot exceed smooth, but the algorithm can still learn to make the denominator as small as possible.
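A quick numeric check of this claim (hypothetical values, same dice_loss as in the question): on an all-zero target, shrinking the predictions shrinks the denominator and with it the loss:

```python
import torch

def dice_loss(input, target):
    smooth = 1.
    iflat = input.view(-1)
    tflat = target.view(-1)
    intersection = (iflat * tflat).sum()
    return 1 - ((2. * intersection + smooth) /
                (iflat.sum() + tflat.sum() + smooth))

target = torch.zeros(1000)                            # all background
loud = dice_loss(torch.full((1000,), 0.5), target)    # 1 - 1/501  ≈ 0.998
quiet = dice_loss(torch.full((1000,), 1e-4), target)  # 1 - 1/1.1  ≈ 0.091

# Smaller outputs -> smaller denominator -> smaller loss,
# even though the intersection stays 0 in both cases.
assert quiet < loud
```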
If you are unhappy with this behavior of the algorithm, you can try increasing the batch size (so that the chance of a batch containing no foreground at all drops) or simply skip such batches. It may or may not help learning.
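Skipping such batches can be as simple as a guard in the training loop. A minimal sketch (the loop body in the comment is illustrative; the helper name is my own):

```python
import torch

def has_foreground(target):
    """True if the patch contains at least one foreground voxel."""
    return bool(target.sum() > 0)

# A training loop would then guard each step, along the lines of:
#   for inputs, targets in loader:
#       if not has_foreground(targets):
#           continue  # skip pure-background patches
#       ... forward / backward / optimizer step ...

empty = torch.zeros(30, 64, 64)
mixed = torch.zeros(30, 64, 64)
mixed[10, 20, 30] = 1.0

print(has_foreground(empty))  # False
print(has_foreground(mixed))  # True
```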
That said, I personally have never managed to use Dice/IoU as a loss function to learn segmentation, and I usually pick binary cross-entropy or a similar loss instead, keeping the former as a validation metric.
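That split, cross-entropy for training and Dice only as a metric, might look like the sketch below. The tensors are random placeholders; the pos_weight value of 4.0 is an assumption based on the roughly 80/20 background/foreground ratio from the question:

```python
import torch
import torch.nn.functional as F

def dice_score(probs, target, smooth=1.):
    """Dice coefficient, tracked as a metric rather than optimized directly."""
    iflat, tflat = probs.view(-1), target.view(-1)
    intersection = (iflat * tflat).sum()
    return (2. * intersection + smooth) / (iflat.sum() + tflat.sum() + smooth)

logits = torch.randn(2, 30, 64, 64)                  # stand-in network outputs
target = (torch.rand(2, 30, 64, 64) > 0.8).float()   # ~20% foreground

# Train with BCE on logits; pos_weight upweights the rare foreground class.
pos_weight = torch.tensor([4.0])  # ≈ background/foreground voxel ratio
loss = F.binary_cross_entropy_with_logits(logits, target, pos_weight=pos_weight)

# Report Dice only as a validation metric, computed on probabilities.
with torch.no_grad():
    metric = dice_score(torch.sigmoid(logits), target)
```

Unlike the Dice loss, BCE gives a finite, well-behaved gradient on all-background patches, which is exactly the case the question struggles with.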