What are the labels for an image segmentation task in computer vision?

Date: 2018-07-17 00:29:38

Tags: computer-vision mask image-segmentation pytorch cross-entropy

I have recently been working on image segmentation tasks and want to implement one from scratch.

As I understand it, segmentation is the per-pixel prediction of the region each pixel belongs to: object instances (things) and background segments (stuff).

According to the COCO dataset, on which the latest algorithm Mask R-CNN is based:


Things are countable objects such as people, animals, tools. Stuff classes are amorphous regions of similar texture or material such as grass, sky, road.

According to the Mask R-CNN paper, the final mask classification uses a binary cross-entropy loss with a per-pixel sigmoid (to avoid competition among classes). The pipeline is built on top of the Faster R-CNN object-detection pipeline: it takes the regions of interest (RoIs) from there and passes them through a RoIAlign layer to keep the spatial information intact.
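
A rough sketch of that per-pixel sigmoid + binary cross-entropy idea (shapes and names here are illustrative, not the paper's actual code; in Mask R-CNN, additionally, only the mask of the ground-truth class contributes to the loss):

import torch
import torch.nn.functional as F

# Hypothetical sizes: batch, classes, and predicted-mask resolution
N, C, H, W = 2, 3, 28, 28
mask_logits = torch.randn(N, C, H, W, requires_grad=True)   # one mask per class
gt_masks = torch.randint(0, 2, (N, C, H, W)).float()        # binary targets

# Each class gets its own independent sigmoid, so classes do not compete
loss = F.binary_cross_entropy_with_logits(mask_logits, gt_masks)
loss.backward()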

What I am confused about is the following. Below is a very simple snippet that applies a binary cross-entropy loss to three separate fully connected layers (some random experiments with scale):

import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder layer sizes for the experiment:
incoming_size_1 = incoming_size_2 = incoming_size_3 = 256
outgoing_size_1 = outgoing_size_2 = outgoing_size_3 = 128

class ModelMain(nn.Module):
    def __init__(self, config=None, is_training=True):
        super(ModelMain, self).__init__()
        self.fc_1 = torch.nn.Linear(incoming_size_1, outgoing_size_1)
        self.fc_2 = torch.nn.Linear(incoming_size_2, outgoing_size_2)
        self.fc_3 = torch.nn.Linear(incoming_size_3, outgoing_size_3)

    def forward(self, x):
        y_1 = F.sigmoid(self.fc_1(x))
        y_2 = F.sigmoid(self.fc_2(x))
        y_3 = F.sigmoid(self.fc_3(x))
        return y_1, y_2, y_3


model = ModelMain()
criterion = torch.nn.BCELoss(size_average=True)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def run_epoch():
    num_epochs = 10
    for epoch in range(num_epochs):
        # Find the image segments predicted by running a forward pass
        # (what batch_data_x and batch_data_y should hold is the question below):
        y_predicted_1, y_predicted_2, y_predicted_3 = model(batch_data_x)

        # Compute and print the losses:
        loss_1 = criterion(y_predicted_1, batch_data_y)
        loss_2 = criterion(y_predicted_2, batch_data_y)
        loss_3 = criterion(y_predicted_3, batch_data_y)

        print("Epoch ", epoch, "Loss : ", loss_1, loss_2, loss_3)

        # Perform the backward pass (sum the losses, then one backward/step):
        optimizer.zero_grad()
        (loss_1 + loss_2 + loss_3).backward()
        optimizer.step()

... what exactly are the labels we supply here?

From the dataset:

Formatted JSON Data

Image:

 {
       "license":2,
       "file_name":"000000000139.jpg",
       "coco_url":"http://images.cocodataset.org/val2017/000000000139.jpg",
       "height":426,
       "width":640,
       "date_captured":"2013-11-21 01:34:01",
       "flickr_url":"http://farm9.staticflickr.com/8035/8024364858_9c41dc1666_z.jpg",
       "id":139
    }

Segmentation info:

{
   "segments_info":[
      {
         "id":3226956,
         "category_id":1,
         "iscrowd":0,
         "bbox":[
            413,
            158,
            53,
            138
         ],
         "area":2840
      },
      {
         "id":6979964,
         "category_id":1,
         "iscrowd":0,
         "bbox":[
            384,
            172,
            16,
            36
         ],
         "area":439
      },
      {
         "id":3103374,
         "category_id":62,
         "iscrowd":0,
         "bbox":[
            413,
            223,
            30,
            81
         ],
         "area":1250
      },
      {
         "id":2831194,
         "category_id":62,
         "iscrowd":0,
         "bbox":[
            291,
            218,
            62,
            98
         ],
         "area":1848
      },
      {
         "id":3496593,
         "category_id":62,
         "iscrowd":0,
         "bbox":[
            412,
            219,
            10,
            13
         ],
         "area":90
      },
      {
         "id":2633066,
         "category_id":62,
         "iscrowd":0,
         "bbox":[
            317,
            219,
            22,
            12
         ],
         "area":212
      },
      {
         "id":3165572,
         "category_id":62,
         "iscrowd":0,
         "bbox":[
            359,
            218,
            56,
            103
         ],
         "area":2251
      },
      {
         "id":8824489,
         "category_id":64,
         "iscrowd":0,
         "bbox":[
            237,
            149,
            24,
            62
         ],
         "area":369
      },
      {
         "id":3032951,
         "category_id":67,
         "iscrowd":0,
         "bbox":[
            321,
            231,
            126,
            89
         ],
         "area":2134
      },
      {
         "id":2038814,
         "category_id":72,
         "iscrowd":0,
         "bbox":[
            7,
            168,
            149,
            95
         ],
         "area":13247
      },
      {
         "id":3289671,
         "category_id":72,
         "iscrowd":0,
         "bbox":[
            557,
            209,
            82,
            79
         ],
         "area":5846
      },
      {
         "id":2437710,
         "category_id":78,
         "iscrowd":0,
         "bbox":[
            512,
            206,
            15,
            16
         ],
         "area":224
      },
      {
         "id":4159376,
         "category_id":82,
         "iscrowd":0,
         "bbox":[
            493,
            174,
            20,
            108
         ],
         "area":2056
      },
      {
         "id":3423599,
         "category_id":84,
         "iscrowd":0,
         "bbox":[
            613,
            308,
            13,
            46
         ],
         "area":324
      },
      {
         "id":3094634,
         "category_id":84,
         "iscrowd":0,
         "bbox":[
            605,
            306,
            14,
            45
         ],
         "area":331
      },
      {
         "id":3296100,
         "category_id":85,
         "iscrowd":0,
         "bbox":[
            448,
            121,
            14,
            22
         ],
         "area":227
      },
      {
         "id":6054280,
         "category_id":86,
         "iscrowd":0,
         "bbox":[
            241,
            195,
            14,
            18
         ],
         "area":187
      },
      {
         "id":5942189,
         "category_id":86,
         "iscrowd":0,
         "bbox":[
            549,
            309,
            36,
            90
         ],
         "area":2171
      },
      {
         "id":4086154,
         "category_id":86,
         "iscrowd":0,
         "bbox":[
            351,
            209,
            11,
            22
         ],
         "area":178
      },
      {
         "id":7438777,
         "category_id":86,
         "iscrowd":0,
         "bbox":[
            337,
            200,
            10,
            16
         ],
         "area":120
      },
      {
         "id":3031159,
         "category_id":118,
         "iscrowd":0,
         "bbox":[
            0,
            269,
            564,
            157
         ],
         "area":49754
      },
      {
         "id":9284267,
         "category_id":119,
         "iscrowd":0,
         "bbox":[
            338,
            166,
            29,
            50
         ],
         "area":842
      },
      {
         "id":6068135,
         "category_id":130,
         "iscrowd":0,
         "bbox":[
            212,
            11,
            321,
            127
         ],
         "area":3391
      },
      {
         "id":2567230,
         "category_id":156,
         "iscrowd":0,
         "bbox":[
            129,
            168,
            351,
            162
         ],
         "area":5699
      },
      {
         "id":10334639,
         "category_id":181,
         "iscrowd":0,
         "bbox":[
            204,
            63,
            234,
            174
         ],
         "area":15587
      },
      {
         "id":6266027,
         "category_id":186,
         "iscrowd":0,
         "bbox":[
            136,
            0,
            473,
            116
         ],
         "area":20106
      },
      {
         "id":5274512,
         "category_id":188,
         "iscrowd":0,
         "bbox":[
            0,
            38,
            549,
            297
         ],
         "area":25483
      },
      {
         "id":7238567,
         "category_id":189,
         "iscrowd":0,
         "bbox":[
            457,
            350,
            183,
            76
         ],
         "area":9421
      },
      {
         "id":4224910,
         "category_id":199,
         "iscrowd":0,
         "bbox":[
            0,
            0,
            640,
            358
         ],
         "area":83201
      },
      {
         "id":6391959,
         "category_id":200,
         "iscrowd":0,
         "bbox":[
            135,
            359,
            336,
            67
         ],
         "area":12618
      }
   ],
   "file_name":"000000000139.png",
   "image_id":139
}

Mask image:

[mask for sample image 139 from the COCO dataset]

Original image:

[original image 000000000139.jpg]

For an object-detection task we have bounding boxes, but for image segmentation I need to compute the loss against the provided masks. So what should the value of batch_data_y be in the code above? A vector of the mask image? But wouldn't that just teach my network what color a segment is? Or is there some other segmentation annotation I am missing?

2 answers:

Answer 0 (score: 1)

As @hkchengrex mentioned in the comments, the colors in the mask image seem to be picked from the real image, which is either a coincidence or the result of some post-processing for visualization.

Semantic masks are usually represented/stored as images in which the value of each pixel encodes its class in the actual picture. For instance, supposing you are considering C classes, the semantic mask M of a picture I can be represented as an image where M(i, j) = c means that pixel I(i, j) should be classified as belonging to semantic class c (for i in [0, H[, j in [0, W[, c in [0, C[, with (H, W) the dimensions of I).

Now, since the classes are independent of one another, the best way for your network to predict them is to output a probability map P of shape (H, W, C), where P(i, j, c) represents the estimated probability that pixel I(i, j) belongs to class c (a value between 0 and 1, hence the sigmoid-like activation).

With such an output, you can train your network with binary cross-entropy, as you detailed yourself -- supposing you pre-process the ground-truth masks M, converting them from images with values in [0, C[ into one-hot maps with values in {0, 1} of shape (H, W, C). This pre-processing is called "one-hot conversion" and can be done in PyTorch using scatter(), c.f. this thread.

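A minimal sketch of that one-hot conversion with scatter_() (the sizes are illustrative; the mask is assumed to already hold integer class ids):

import torch

# Hypothetical ground-truth mask of shape (H, W) holding class ids in [0, C[
H, W, C = 4, 4, 3
mask = torch.randint(0, C, (H, W))

# One-hot conversion: (H, W) -> (C, H, W), one_hot[c, i, j] = 1 iff mask[i, j] == c
one_hot = torch.zeros(C, H, W)
one_hot.scatter_(0, mask.unsqueeze(0), 1.0)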

However, another solution (which may not fit your problem if you want to avoid the softmax, since it includes that operation) is to use the (non-binary) cross-entropy loss. torch.nn.CrossEntropyLoss() directly takes the raw class scores as the prediction and the integer mask M as the target (in PyTorch's layout, scores of shape (N, C, H, W) and targets of shape (N, H, W)).
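
A minimal sketch of that variant (sizes are again illustrative; note that the loss expects raw scores, as it applies log-softmax internally):

import torch
import torch.nn as nn

N, C, H, W = 2, 5, 32, 32
logits = torch.randn(N, C, H, W, requires_grad=True)   # raw per-pixel class scores
target = torch.randint(0, C, (N, H, W))                # integer mask M

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)
loss.backward()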

Answer 1 (score: 0)

@Aldream's intuition is correct, but to be explicit about the COCO dataset: they provide binary masks, and the documentation on their website is not very good:


Interface for manipulating masks stored in RLE format.


RLE is a simple yet efficient format for storing binary masks. RLE first divides a vector (or vectorized image) into a series of piecewise constant regions and then, for each piece, simply stores the length of that piece. For example, given M = [0 0 1 1 1 0 1] the RLE counts would be [2 3 1 1], or for M = [1 1 1 1 1 1 0] the counts would be [0 6 1] (note that the odd counts are always the numbers of zeros). Instead of storing the counts directly, additional compression is achieved with a variable-bitrate representation based on a common scheme called LEB128. Source: link
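
In practice the annotations store the LEB128-compressed form, which pycocotools' mask.decode() handles for you; the toy decoder below only illustrates the uncompressed counts scheme described above:

import numpy as np

def rle_decode(counts, shape):
    # counts alternates run lengths of 0s and 1s, starting with the zeros,
    # e.g. [2, 3, 1, 1] -> [0, 0, 1, 1, 1, 0, 1]
    mask, value = [], 0
    for run in counts:
        mask.extend([value] * run)
        value = 1 - value
    # COCO stores masks in column-major (Fortran) order
    return np.asarray(mask, dtype=np.uint8).reshape(shape, order='F')

print(rle_decode([2, 3, 1, 1], (7, 1)).ravel())   # [0 0 1 1 1 0 1]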

I did, though, write my own custom function for the mean binary cross-entropy loss:

import torch
import torch.nn.functional as F

def l_cross_entropy2d(input, target, weight=None, size_average=True):
    n, c, h, w = input.size()
    nt, ht, wt = target.size()

    # Handle inconsistent sizes between input and target
    if h > ht and w > wt:  # upsample labels
        target = target.unsqueeze(1).float()
        target = F.upsample(target, size=(h, w), mode='nearest')
        target = target.squeeze(1).long()
    elif h < ht and w < wt:  # upsample images
        input = F.upsample(input, size=(ht, wt), mode='bilinear')
    elif h != ht and w != wt:
        raise Exception("Only support upsampling")

    # take the per-pixel sigmoid
    sigm = F.sigmoid(input)
    # reshape to a 2d matrix where rows -> pixels and columns -> classes:
    # takes an <n x c x h x w> tensor to an <n*h*w x c> tensor
    sigm = sigm.transpose(1, 2).transpose(2, 3).contiguous().view(-1, c)

    # keep only the pixels whose label is >= 0 (i.e. not ignored),
    # repeating the flattened labels once per class to index the 2d matrix
    sigm = sigm[target.view(-1, 1).repeat(1, c) >= 0]
    sigm = sigm.view(-1, c)

    mask = target >= 0
    target = target[mask]
    # nll_loss expects log-probabilities, hence the log of the sigmoid
    loss = F.nll_loss(torch.log(sigm), target, ignore_index=250,
                      weight=weight, size_average=False)
    if size_average:
        loss /= mask.data.sum()
    return loss