Question

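Following material I found online, I implemented DepthWise_Conv2d in PyTorch with the code below.

Testing its performance with torchinfo's summary, I found that both the running time and the reported size are very large.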

Depthwise_Conv2d

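import torch.nn as nn
m = nn.Conv2d(16, 16, 3, stride=1, groups=16, dilation=1, bias=False)  # depthwise 3x3: one filter per input channel
n = nn.Conv2d(16, 33, 1, stride=1, dilation=1, bias=False)  # pointwise 1x1: mixes the 16 channels into 33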
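For anyone who wants to reproduce the numbers, here is a minimal sketch of the two models being compared. The question does not show the input tensor or the baseline layer, so both are assumptions: the input shape (20, 16, 50, 100) is inferred from the reported output shape and input size, and the baseline `nn.Conv2d(16, 33, 3)` with bias is inferred from its 4,785 parameters. The names `separable`, `standard` and `count_params` are made up for this example.

```python
import torch
import torch.nn as nn

# Depthwise 3x3 followed by pointwise 1x1, as in the question.
separable = nn.Sequential(
    nn.Conv2d(16, 16, 3, stride=1, groups=16, dilation=1, bias=False),
    nn.Conv2d(16, 33, 1, stride=1, dilation=1, bias=False),
)

# Assumed baseline: a plain 3x3 convolution with bias (16*33*9 + 33 = 4785 params).
standard = nn.Conv2d(16, 33, 3, stride=1)


def count_params(module):
    """Total number of parameter elements in a module."""
    return sum(p.numel() for p in module.parameters())


# Assumed input shape: batch 20, 16 channels, 50x100 pixels.
x = torch.randn(20, 16, 50, 100)

with torch.no_grad():
    y_sep = separable(x)
    y_std = standard(x)

print(tuple(y_sep.shape), count_params(separable))  # (20, 33, 48, 98) 672
print(tuple(y_std.shape), count_params(standard))   # (20, 33, 48, 98) 4785
```

Both models produce the same output shape, so any difference in the measurements comes from how the work is split across layers rather than from the amount of output.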

                    Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg       CPU Mem  Self CPU Mem    # of Calls

aten::mkldnn_convolution        90.40%      13.248ms        90.88%      13.318ms       6.659ms      17.59 Mb           0 b             2
         model_inference         8.38%       1.228ms        99.81%      14.627ms      14.627ms          -4 b     -17.59 Mb             1
             aten::empty         0.47%      69.400us         0.47%      69.400us      17.350us      17.59 Mb      17.59 Mb             4
      aten::_convolution         0.24%      35.000us        91.12%      13.353ms       6.677ms      17.59 Mb           0 b             2
            aten::conv2d         0.19%      27.900us        91.42%      13.397ms       6.699ms      17.59 Mb           0 b             2
             aten::zeros         0.12%      17.800us         0.19%      27.600us      27.600us           4 b           0 b             1
       aten::convolution         0.11%      16.200us        91.23%      13.369ms       6.685ms      17.59 Mb           0 b             2
       aten::as_strided_         0.07%      10.300us         0.07%      10.300us       5.150us           0 b           0 b             2
             aten::zero_         0.02%       2.600us         0.02%       2.600us       2.600us           0 b           0 b             1

Self CPU time total: 14.655ms
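The table above has the layout that `torch.profiler` prints, with the forward pass wrapped in a `record_function("model_inference")` block. The profiling code is not shown in the question, so the following is only a guess at how such a table could be produced; the input shape and the choice of `profile_memory=True` (suggested by the CPU Mem columns) are assumptions.

```python
import torch
import torch.nn as nn
from torch.profiler import profile, record_function, ProfilerActivity

# Depthwise + pointwise pair from the question.
model = nn.Sequential(
    nn.Conv2d(16, 16, 3, stride=1, groups=16, dilation=1, bias=False),
    nn.Conv2d(16, 33, 1, stride=1, dilation=1, bias=False),
)
x = torch.randn(20, 16, 50, 100)  # assumed input shape

# Profile a single forward pass on the CPU, tracking memory allocations too.
with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    with record_function("model_inference"):
        model(x)

# Aggregate events by operator name and sort by self CPU time.
table = prof.key_averages().table(sort_by="self_cpu_time_total")
print(table)
```

The operator names in the output depend on the PyTorch build and CPU backend, so rows such as `aten::mkldnn_convolution` may not appear on every machine.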

                    Layer (type:depth-idx)                   Output Shape              Param #
                    ├─Conv2d: 1-1                            [20, 16, 48, 98]          144
                    ├─Conv2d: 1-2                            [20, 33, 48, 98]          528

Total params: 672
Trainable params: 672
Non-trainable params: 0
Total mult-adds (M): 3.16

Input size (MB): 6.10
Forward/backward pass size (MB): 35.17
Params size (MB): 0.00
Estimated Total Size (MB): 41.28
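The figures in this summary can be re-derived by hand, which shows where the size comes from. The sketch below is reverse-engineered from the reported numbers rather than taken from torchinfo's source: it assumes float32 tensors, a 50x100 input (3x3 kernel, no padding), mult-adds counted per sample, and a forward/backward size equal to twice the bytes of every layer output. All names are made up for this example.

```python
MIB = 1024 ** 2      # the reported "MB" values match mebibytes
FLOAT32_BYTES = 4


def conv2d_weight_count(in_ch, out_ch, kernel, groups=1):
    """Weights in a bias-free Conv2d: each output channel sees in_ch/groups inputs."""
    return out_ch * (in_ch // groups) * kernel * kernel


batch, out_h, out_w = 20, 48, 98     # from the Output Shape column
in_h, in_w = out_h + 2, out_w + 2    # assumed: 3x3 kernel, stride 1, no padding

depthwise = conv2d_weight_count(16, 16, 3, groups=16)  # 144
pointwise = conv2d_weight_count(16, 33, 1)             # 528
total_params = depthwise + pointwise                   # 672

# One multiply-add per weight per output position (single sample).
mult_adds_m = total_params * out_h * out_w / 1e6

# Input tensor, plus the outputs of both layers counted twice
# (once for the activation, once for its gradient).
input_mb = batch * 16 * in_h * in_w * FLOAT32_BYTES / MIB
output_elems = batch * (16 + 33) * out_h * out_w
fwd_bwd_mb = 2 * output_elems * FLOAT32_BYTES / MIB
params_mb = total_params * FLOAT32_BYTES / MIB

print(f"params={total_params} mult-adds(M)={mult_adds_m:.2f}")
print(f"input={input_mb:.2f} fwd/bwd={fwd_bwd_mb:.2f} "
      f"total={input_mb + fwd_bwd_mb + params_mb:.2f}")
```

Under these assumptions the results match the summary (672, 3.16, 6.10, 35.17, 41.28). Almost all of the estimated size is activations: the intermediate 16-channel tensor produced by the depthwise layer is stored in addition to the final 33-channel output.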

Conv2d summary

                    Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg       CPU Mem  Self CPU Mem    # of Calls

aten::mkldnn_convolution        90.64%       9.066ms        91.02%       9.104ms       9.104ms      11.84 Mb           0 b             1
         model_inference         8.16%     815.700us        99.71%       9.973ms       9.973ms          -4 b     -11.84 Mb             1
             aten::empty         0.44%      43.700us         0.44%      43.700us      14.567us      11.84 Mb      11.84 Mb             3
      aten::_convolution         0.23%      23.500us        91.25%       9.127ms       9.127ms      11.84 Mb           0 b             1
            aten::conv2d         0.19%      19.000us        91.53%       9.155ms       9.155ms      11.84 Mb           0 b             1
             aten::zeros         0.18%      18.100us         0.29%      28.900us      28.900us           4 b           0 b             1
       aten::convolution         0.09%       9.200us        91.34%       9.136ms       9.136ms      11.84 Mb           0 b             1
       aten::as_strided_         0.04%       4.000us         0.04%       4.000us       4.000us           0 b           0 b             1
             aten::zero_         0.03%       3.000us         0.03%       3.000us       3.000us           0 b           0 b             1

                    Layer (type:depth-idx)                   Output Shape              Param #
                    └─Conv2d: 0-1                            [20, 33, 48, 98]          4,785

Total params: 4,785
Trainable params: 4,785
Non-trainable params: 0
Total mult-adds (M): 22.35
Input size (MB): 6.10
Forward/backward pass size (MB): 23.69
Params size (MB): 0.02
Estimated Total Size (MB): 29.81
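Putting the two summaries side by side as ratios may help frame the question. The sketch below uses only numbers from the tables plus the same assumptions as before (float32, mult-adds counted per sample, baseline `Conv2d(16, 33, 3)` with bias). It is arithmetic on the reported figures, not a diagnosis.

```python
MIB = 1024 ** 2
FLOAT32_BYTES = 4
batch, out_h, out_w = 20, 48, 98  # from the Output Shape column

# Weight counts without bias (bias adds no multiply-adds in this estimate).
separable_weights = 16 * 1 * 3 * 3 + 33 * 16 * 1 * 1   # 144 + 528 = 672
standard_weights = 33 * 16 * 3 * 3                     # 4752
standard_params = standard_weights + 33                # 4785 with bias

# Mult-adds per sample, in millions.
separable_macs_m = separable_weights * out_h * out_w / 1e6
standard_macs_m = standard_weights * out_h * out_w / 1e6

# Forward/backward size: every layer output counted twice.
separable_act_mb = 2 * batch * (16 + 33) * out_h * out_w * FLOAT32_BYTES / MIB
standard_act_mb = 2 * batch * 33 * out_h * out_w * FLOAT32_BYTES / MIB

# CPU totals for model_inference from the two profiler tables, in ms.
separable_ms, standard_ms = 14.627, 9.973

print(f"mult-adds: standard/separable = {standard_macs_m / separable_macs_m:.2f}")
print(f"activations: separable/standard = {separable_act_mb / standard_act_mb:.2f}")
print(f"time: separable/standard = {separable_ms / standard_ms:.2f}")
```

The separable version does about 7 times fewer multiply-adds but writes about 1.48 times more activation data, and the single measured run is about 1.47 times slower. One possible reading, which these numbers alone do not prove, is that at this size the run time follows memory traffic and per-layer overhead more than the arithmetic count.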

Can you tell me what went wrong?

Why is a depthwise_conv2d implemented with the groups parameter slower than Conv2d in PyTorch?
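Each profile above covers a single forward pass, which may include one-time setup cost and can be noisy. Before drawing conclusions it may be worth timing both models over repeated runs after a warm-up. This is a minimal sketch with plain `time.perf_counter`; the input shape, the baseline layer, the helper name `median_forward_ms` and the run counts are all assumptions, and the results will vary with the machine, thread count and PyTorch build.

```python
import statistics
import time

import torch
import torch.nn as nn


def median_forward_ms(model, x, warmup=5, runs=20):
    """Median wall-clock time of one forward pass, in milliseconds."""
    model.eval()
    times = []
    with torch.no_grad():
        for _ in range(warmup):          # discard the first, slower iterations
            model(x)
        for _ in range(runs):
            start = time.perf_counter()
            model(x)
            times.append((time.perf_counter() - start) * 1e3)
    return statistics.median(times)


separable = nn.Sequential(
    nn.Conv2d(16, 16, 3, stride=1, groups=16, dilation=1, bias=False),
    nn.Conv2d(16, 33, 1, stride=1, dilation=1, bias=False),
)
standard = nn.Conv2d(16, 33, 3, stride=1)  # assumed baseline
x = torch.randn(20, 16, 50, 100)           # assumed input shape

sep_ms = median_forward_ms(separable, x)
std_ms = median_forward_ms(standard, x)
print(f"separable: {sep_ms:.3f} ms, standard: {std_ms:.3f} ms")
```

If the gap persists across repeated runs, it reflects the kernels themselves rather than first-call overhead.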

0 answers: