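I am trying to convert the TensorFlow ssd_mobilenet_v1_coco model to a PyTorch model in an efficient way, so I extracted all the TensorFlow layers and mapped them onto the layers of a predefined MobileNetV1_SSD class.
The class is taken from https://github.com/qfgaohao/pytorch-ssd and follows the standard MobileNetV1 SSD architecture defined in the official paper. The problem is that when I map the layers from TF to PyTorch, some layer dimensions do not match, as if the downloaded model had been changed relative to the original architecture. After adapting the class to the downloaded model, I ended up with this: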
from torch.nn import Conv2d, ModuleList, ReLU, Sequential

# num_classes is defined elsewhere (number of object classes plus background).

# Extra feature layers appended after the MobileNetV1 backbone.
extras = ModuleList([
    Sequential(
        Conv2d(in_channels=1024, out_channels=256, kernel_size=1),
        ReLU(),
        Conv2d(in_channels=256, out_channels=512, kernel_size=3, stride=2, padding=1),
        ReLU()
    ),
    Sequential(
        Conv2d(in_channels=512, out_channels=128, kernel_size=1),
        ReLU(),
        Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=2, padding=1),
        ReLU()
    ),
    Sequential(
        Conv2d(in_channels=256, out_channels=128, kernel_size=1),
        ReLU(),
        Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=2, padding=1),
        ReLU()
    ),
    Sequential(
        Conv2d(in_channels=256, out_channels=64, kernel_size=1),
        ReLU(),
        Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=2, padding=1),
        ReLU()
    )
])

# Box regression heads: 4 coordinates per anchor (3 anchors on the first map, 6 on the rest).
regression_headers = ModuleList([
    Conv2d(in_channels=512, out_channels=3 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=1024, out_channels=6 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=512, out_channels=6 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=256, out_channels=6 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=256, out_channels=6 * 4, kernel_size=1, padding=1),
    Conv2d(in_channels=128, out_channels=6 * 4, kernel_size=1, padding=1),
])

# Classification heads: num_classes scores per anchor.
classification_headers = ModuleList([
    Conv2d(in_channels=512, out_channels=3 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=1024, out_channels=6 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=512, out_channels=6 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=256, out_channels=6 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=256, out_channels=6 * num_classes, kernel_size=1, padding=1),
    Conv2d(in_channels=128, out_channels=6 * num_classes, kernel_size=1, padding=1),
])
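OK: the weights transfer between the two models without errors.
KO: evaluating the PyTorch model fails, because the kernel sizes and some of the in_channels and out_channels in the extras, classification, and regression layers do not match the original architecture specification: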
RuntimeError: The size of tensor a (2781) must match the size of tensor b (3000) at non-singleton dimension 1
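The 2781 in that message can be reproduced with a little arithmetic, under assumptions that are not stated in this post: a 300x300 input, so that the six headers see feature maps of 19, 10, 5, 3, 2 and 1 cells per side, and the standard Conv2d output-size formula with dilation 1. The helper names below are hypothetical and only illustrate the calculation. A 1x1 convolution with padding=1 enlarges each map by 2 cells per side, which is one plausible source of the mismatch.

```python
def conv_out_size(n, kernel_size, stride=1, padding=0):
    """Spatial output size of a Conv2d layer with dilation 1."""
    return (n + 2 * padding - kernel_size) // stride + 1


# Assumption: feature map side lengths at the six headers for a 300x300 input.
FEATURE_MAP_SIZES = [19, 10, 5, 3, 2, 1]
# Anchors per location implied by the header out_channels (3 * 4, then 6 * 4).
ANCHORS_PER_LOCATION = [3, 6, 6, 6, 6, 6]


def count_predictions(padding):
    """Total boxes predicted by six 1x1 headers using the given padding."""
    total = 0
    for size, anchors in zip(FEATURE_MAP_SIZES, ANCHORS_PER_LOCATION):
        out = conv_out_size(size, kernel_size=1, padding=padding)
        total += out * out * anchors
    return total


# With padding=1 the maps grow to 21, 12, 7, 5, 4 and 3 cells per side.
print(count_predictions(padding=1))  # 2781
```

If these assumptions hold, tensor a (2781) would be the header output and tensor b (3000) the set of priors it is compared against.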
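Question: does anyone know how the TensorFlow developers implemented this model? In particular, how do they handle the reduced output of the first layer (3 * num_classes instead of 6 * num_classes) when the anchors are selected?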
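To make the anchor bookkeeping concrete, here is a minimal sketch of how the number of priors changes when the first feature map uses 3 anchors per location instead of 6. It assumes feature maps of 19, 10, 5, 3, 2 and 1 cells per side (a 300x300 input), which is not stated in this post, and `count_priors` is a hypothetical helper rather than a function from either library.

```python
def count_priors(feature_map_sizes, anchors_per_location):
    """Total number of anchors over a list of square feature maps."""
    return sum(
        size * size * anchors
        for size, anchors in zip(feature_map_sizes, anchors_per_location)
    )


# Assumption: feature map side lengths for a 300x300 input.
sizes = [19, 10, 5, 3, 2, 1]

# Six anchors everywhere, as the 6 * num_classes headers would imply.
uniform = count_priors(sizes, [6, 6, 6, 6, 6, 6])
# Three anchors on the first map only, as in the adapted headers.
reduced = count_priors(sizes, [3, 6, 6, 6, 6, 6])

print(uniform, reduced)  # 3000 1917
```

Under these assumptions the priors generated on the PyTorch side would also need to drop from 3000 to 1917 for the shapes to line up, so the prior generation has to change together with the first header.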