Question

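I want to add an image transform (which I call ResizeTransformer) that resizes the smaller dimension of an image to a given size while preserving the original aspect ratio.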

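To achieve this without implementing a separate ResizeTransformer, I would like to modify the class ScaleTransformer : public ImageTransformerBase in this file. However, this class implements StreamInformation ScaleTransformer::Transform(const StreamInformation& inputStream), whose purpose is to transform the stream so that all samples have the same size. My questions are: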

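1) Why does this function need to be implemented? Does it bring any performance benefit, or does it matter for some more fundamental reason?
2) Do I have to implement ResizeTransformer() as a separate class?
3) In that case, do I have to implement StreamInformation ResizeTransformer::Transform(const StreamInformation& inputStream)?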

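Why this transformation is needed: the images in a dataset may all have different sizes, and one may want to extract multiple patches from each image. In that case, the best solution is to resize the smaller dimension of the image to a specific size S that is larger than the crop size C, and then extract multiple patches of size C from it. This kind of data augmentation is used in some papers I know of.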
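The resize-then-crop scheme described above can be sketched in plain Python. This is only an illustration of the arithmetic, not CNTK code: the helper names, the values S = 256 and C = 224, and the choice of five crops (four corners plus the center) are all assumptions made for the example.

```python
def resize_smaller_side(height, width, s):
    """Return (new_height, new_width) with the smaller side scaled to s.

    Hypothetical helper: rounds to the nearest integer pixel count.
    """
    scale = s / min(height, width)
    return round(height * scale), round(width * scale)


def crop_origins(height, width, c):
    """Top-left (row, col) corners of five c x c crops.

    Assumed layout for illustration: four corners plus the center.
    """
    if c > height or c > width:
        raise ValueError("crop size must not exceed the image size")
    bottom, right = height - c, width - c
    return [(0, 0), (0, right), (bottom, 0), (bottom, right),
            (bottom // 2, right // 2)]


# Images of different sizes all yield C x C patches after the resize.
S, C = 256, 224  # assumed values, with S > C as the text requires
for h, w in [(480, 640), (1000, 500), (300, 300)]:
    new_h, new_w = resize_smaller_side(h, w, S)
    print((h, w), "->", (new_h, new_w), crop_origins(new_h, new_w, C))
```

Because the smaller side always ends up equal to S and S > C, every crop fits inside the resized image, so all patches reaching the network have the same C x C shape.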

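PS: Adding the ResizeTransformer

I am confused about how to test it. The C++ code compiles successfully, so it is at least syntactically valid, but I want to use it from Python.

I added the following to the header file on my system: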

class ResizeTransformer : public ImageTransformerBase
{
public:
  explicit ResizeTransformer(const Microsoft::MSR::CNTK::ConfigParameters& config);

private:
  // The two modes need distinct values; otherwise they compare equal.
  enum class ResizeMode
  {
    ResizeMin = 0,
    ResizeMax = 1
  };

  ResizeMode resize_mode;
  size_t resized_length;
  void Apply(uint8_t copyId, cv::Mat &mat) override;
};

And to the source file:

ResizeTransformer::ResizeTransformer(const ConfigParameters& config) : ImageTransformerBase(config)
{
  resized_length = config(L"resized_length");
  if (resized_length <= 0)
    RuntimeError("Cannot resize any dimension of an image to zero or negative number.");

  string resize_type = config(L"resize_type", "ResizeMin");
  if (resize_type == "ResizeMin")
    resize_mode = ResizeMode::ResizeMin;
  else if (resize_type == "ResizeMax")
    resize_mode = ResizeMode::ResizeMax;
  else RuntimeError("Invalid resize_type. Must be one of ResizeMin and ResizeMax");
}

void ResizeTransformer::Apply(uint8_t, cv::Mat &mat)
{
  float height = mat.rows;
  float width = mat.cols;
  float aspectratio = height / width;
  float newheight{};
  float newwidth{};

  // ResizeMin pins the smaller side to resized_length, ResizeMax the larger one.
  bool pinHeight = (resize_mode == ResizeMode::ResizeMin) ? (height <= width)
                                                          : (height > width);
  if (pinHeight)
  {
    newheight = resized_length;
    newwidth = newheight / aspectratio;
  }
  else
  {
    newwidth = resized_length;
    newheight = aspectratio * resized_length;
  }

  // cv::resize expects an integer size, so round explicitly.
  cv::resize(mat, mat, cv::Size(cvRound(newwidth), cvRound(newheight)));
}

I added the following line to this file:

transformations.push_back(Transformation{ std::make_shared<ResizeTransformer>(featureStream), featureName });

Then I added the following to this file:

CNTK_API ImageTransform ReaderResize(int resized_length,
                                         const wchar_t* resize_type = L"ResizeMin");

Finally, I added the following function to this file:

def resize(resized_length, resize_type='ResizeMin'):
    '''
    Resize transform that can be used to pass to `map_features`
    Given an input image, it will resize a given dimension to
    a fixed size (resized_length), while preserving the aspect ratio.


    Args:
        resized_length (int): A positive integer. It is the resized value of the
           dimension which has to be resized. The other dimension is resized while
           maintaining the aspect ratio.
        resize_type (str, default 'ResizeMin'): 'ResizeMin' or 'ResizeMax'.
           When 'ResizeMin', the smaller dimension of the image is resized to a fixed size
           given by resized_length, with the larger dimension resized in a way to preserve
           the original aspect ratio. When 'ResizeMax', the same operation is performed
           but now the larger dimension of the image is resized to a fixed size.
    Returns:
        A dictionary like object describing the ResizeTransform.
    '''
    return cntk_py.reader_resize(resized_length, resize_type)

Answer 1

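1) Where possible, this lets the upper layers define buffers in advance. So if you know you will resize to (x, y), you can define the output stream shape there (similar to ScaleTransformer). Otherwise, you can set the image layout in the Transform(SequenceDataPtr) method (or in Apply, if you derive from the ImageTransformerBase class).

2) You can, or you can change ScaleTransformer to do what you need (just use additional parameters in the configuration).

3) If you implement your own ResizeTransformer, you can simply put NDShape::Unknown into the transform, for example: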

StreamInformation ResizeTransformer::Transform(
    const StreamInformation& inputStream)
{
    TransformBase::Transform(inputStream);
    // The output shape is not known in advance, because it depends on each image.
    m_outputStream.m_sampleLayout = NDShape::Unknown();
    return m_outputStream;
}

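PS. The code looks fine, but you may still need to add Transform on the inputStream as described above. Also note that by the time the images reach the core network, they must all have the same dimensions. The deserializers do not support images of different shapes.

If you want to expose the ResizeTransformer, you need to do the following:

1) Implement the ResizeTransformer (as described above; you have done this)

2) In ImageReader/Exports.cpp, add name-based resolution to the CreateTransformer function, i.e.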

else if (type == L"Resize")
        *transformer = new ResizeTransformer(config);

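(this step seems to be missing on your side)

3) Add a factory method to the C++ API in CNTKLibrary.h / MinibatchSource.cpp; see the scale transform (ReaderScale) as an example (you have done this): ImageTransform ReaderResize(...) {...}

4) Implement a Python wrapper in bindings/python/cntk/io/transforms.py that checks the parameters and so on (you have done this): def resize(...):

Then, if you recompile and set PATH to your local CNTK build (/x64/Release) and PYTHONPATH to /bindings/python, you should be able to use the new transform. You can add tests to io/tests, then go to /bindings/python/cntk and run "pytest".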
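As a rough idea of what the parameter checking in step 4 and a matching pytest-style test could look like, here is a minimal sketch. It deliberately avoids importing cntk so that it runs without a local build: check_resize_args is a hypothetical helper, not an existing CNTK function, and a real test under io/tests would presumably call the actual resize wrapper instead.

```python
def check_resize_args(resized_length, resize_type="ResizeMin"):
    """Hypothetical pre-check a wrapper could run before calling into C++."""
    is_int = isinstance(resized_length, int) and not isinstance(resized_length, bool)
    if not is_int or resized_length <= 0:
        raise ValueError("resized_length must be a positive integer")
    if resize_type not in ("ResizeMin", "ResizeMax"):
        raise ValueError("resize_type must be 'ResizeMin' or 'ResizeMax'")
    return resized_length, resize_type


# pytest collects functions named test_* from files named test_*.py.
def test_accepts_valid_arguments():
    assert check_resize_args(256) == (256, "ResizeMin")
    assert check_resize_args(256, "ResizeMax") == (256, "ResizeMax")


def test_rejects_invalid_arguments():
    for bad in [(0, "ResizeMin"), (-1, "ResizeMin"), (256, "Stretch")]:
        try:
            check_resize_args(*bad)
        except ValueError:
            continue
        raise AssertionError("expected ValueError for %r" % (bad,))
```

Failing early in Python gives a clearer error message than letting an invalid value travel down to the RuntimeError raised in the C++ constructor.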

I may have forgotten something, so if you run into any problems, please ask the CNTK team; they should be able to help.

Thanks!

How to add an image resize transform to CNTK
