I trained a network using the provided ImageReader, and now I am trying to use the CNTK EvalDll in a C# project to evaluate an RGB image.
I have seen the samples related to EvalDll, but the input is always an array of floats/doubles, never an image.
How can I feed an RGB image to the trained network through the exposed interface?
Answer 0 (score: 2)
I am assuming that you want to do the same kind of reading that you trained with, where your reader configuration looked something like

features=[
    width=224
    height=224
    channels=3
    cropType=Center
]

You need helper functions to crop the image and to re-size it to the input size the network accepts. I will define two extension methods on System.Drawing.Bitmap, one for cropping and one for re-sizing:

open System.Collections.Generic
open System.Drawing
open System.Drawing.Drawing2D
open System.Drawing.Imaging
type Bitmap with
    /// Crops the image in the present object, starting at the given (column, row), and retaining
    /// the given number of columns and rows.
    member this.Crop(column, row, numCols, numRows) =
        let rect = Rectangle(column, row, numCols, numRows)
        this.Clone(rect, this.PixelFormat)

    /// Creates a resized version of the present image. The returned image
    /// will have the given width and height. This may distort the aspect ratio
    /// of the image.
    member this.ResizeImage(width, height, useHighQuality) =
        // Rather than using image.GetThumbnailImage, use direct image resizing.
        // GetThumbnailImage throws odd out-of-memory exceptions on some
        // images, see also
        // http://stackoverflow.com/questions/27528057/c-sharp-out-of-memory-exception-in-getthumbnailimage-on-a-server
        // Use the interpolation method suggested on
        // http://stackoverflow.com/questions/1922040/resize-an-image-c-sharp
        let rect = Rectangle(0, 0, width, height)
        let destImage = new Bitmap(width, height)
        destImage.SetResolution(this.HorizontalResolution, this.VerticalResolution)
        use graphics = Graphics.FromImage destImage
        graphics.CompositingMode <- CompositingMode.SourceCopy
        if useHighQuality then
            graphics.InterpolationMode <- InterpolationMode.HighQualityBicubic
            graphics.CompositingQuality <- CompositingQuality.HighQuality
            graphics.SmoothingMode <- SmoothingMode.HighQuality
            graphics.PixelOffsetMode <- PixelOffsetMode.HighQuality
        else
            graphics.InterpolationMode <- InterpolationMode.Low
        use wrapMode = new ImageAttributes()
        wrapMode.SetWrapMode WrapMode.TileFlipXY
        graphics.DrawImage(this, rect, 0, 0, this.Width, this.Height, GraphicsUnit.Pixel, wrapMode)
        destImage
Based on that, define a function that performs the center crop:

/// Returns a square sub-image from the center of the given image, with
/// a size that is cropRatio times the smallest image dimension. The
/// aspect ratio is preserved.
let CenterCrop cropRatio (image: Bitmap) =
    let cropSize =
        float(min image.Height image.Width) * cropRatio
        |> int
    let startRow = (image.Height - cropSize) / 2
    let startCol = (image.Width - cropSize) / 2
    image.Crop(startCol, startRow, cropSize, cropSize)
Then plug it all together: crop, re-size, and traverse the image in the plane order that OpenCV uses:

/// Creates a list of CNTK feature values from a given bitmap.
/// The image is first resized to fit into a (targetSize x targetSize) bounding box,
/// then the image planes are converted to a CNTK tensor.
/// Returns a list with targetSize*targetSize*3 values.
let ImageToFeatures (image: Bitmap, targetSize) =
    // Apply the same image pre-processing that is typically done
    // in CNTK when running it in test or write mode: take a center
    // crop of the image, then re-size it to the network input size.
    let cropped = CenterCrop 1.0 image
    let resized = cropped.ResizeImage(targetSize, targetSize, false)
    // Ensure that the initial capacity of the list is provided
    // with the constructor. Creating the list via the default constructor
    // makes the whole operation 20% slower.
    let features = List (targetSize * targetSize * 3)
    // Traverse the image in the format that is used in OpenCV:
    // first the B plane, then the G plane, then the R plane.
    for c in 0 .. 2 do
        for h in 0 .. (resized.Height - 1) do
            for w in 0 .. (resized.Width - 1) do
                let pixel = resized.GetPixel(w, h)
                let v =
                    (match c with
                     | 0 -> pixel.B
                     | 1 -> pixel.G
                     | 2 -> pixel.R
                     | _ -> failwith "No such channel")
                    |> float32
                features.Add v
    features
Call ImageToFeatures with the image in question, feed the result into an instance of IEvaluateModelManagedF, and you are good to go. Here I am assuming that your RGB image is in myImage, that you are doing binary classification, and that the network input size is 224 x 224:

let LoadModelOnCpu modelPath =
    let model = new IEvaluateModelManagedF()
    let description = sprintf "deviceId=-1\r\nmodelPath=\"%s\"" modelPath
    model.Init description
    model.CreateNetwork description
    model

let model = LoadModelOnCpu("myModelFile")
let featureDict = Dictionary()
featureDict.["features"] <- ImageToFeatures(myImage, 224)
model.Evaluate(featureDict, "OutputNodes.z", 2)
Answer 1 (score: 2)
I implemented similar code in C#: it loads the model, reads a test image, does the appropriate cropping/scaling/etc., and runs the model. As Anton pointed out, the output does not exactly match CNTK's output, but it is very close.
Image reading / cropping / scaling code:
private static Bitmap ImCrop(Bitmap img, int col, int row, int numCols, int numRows)
{
    var rect = new Rectangle(col, row, numCols, numRows);
    return img.Clone(rect, System.Drawing.Imaging.PixelFormat.DontCare);
}

/// Returns a square sub-image from the center of the given image, with
/// a size that is cropRatio times the smallest image dimension. The
/// aspect ratio is preserved.
private static Bitmap ImCropToCenter(Bitmap img, double cropRatio)
{
    var cropSize = (int)Math.Round(Math.Min(img.Height, img.Width) * cropRatio);
    var startCol = (img.Width - cropSize) / 2;
    var startRow = (img.Height - cropSize) / 2;
    return ImCrop(img, startCol, startRow, cropSize, cropSize);
}

/// Creates a resized version of the present image. The returned image
/// will have the given width and height. This may distort the aspect ratio
/// of the image.
private static Bitmap ImResize(Bitmap img, int width, int height)
{
    return new Bitmap(img, new Size(width, height));
}
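The plain Bitmap constructor above uses default interpolation. If you want the same high-quality interpolation path as the F# ResizeImage extension in the first answer, a sketch of an equivalent C# helper could look like the following; the name ImResizeHighQuality is just illustrative and it is not part of the original answer's pipeline:

// Requires using System.Drawing.Drawing2D and System.Drawing.Imaging.
private static Bitmap ImResizeHighQuality(Bitmap img, int width, int height)
{
    // Mirrors the settings used in the F# ResizeImage extension above.
    var destImage = new Bitmap(width, height);
    destImage.SetResolution(img.HorizontalResolution, img.VerticalResolution);
    using (var graphics = Graphics.FromImage(destImage))
    using (var wrapMode = new ImageAttributes())
    {
        graphics.CompositingMode = CompositingMode.SourceCopy;
        graphics.InterpolationMode = InterpolationMode.HighQualityBicubic;
        graphics.CompositingQuality = CompositingQuality.HighQuality;
        graphics.SmoothingMode = SmoothingMode.HighQuality;
        graphics.PixelOffsetMode = PixelOffsetMode.HighQuality;
        wrapMode.SetWrapMode(WrapMode.TileFlipXY);
        graphics.DrawImage(img, new Rectangle(0, 0, width, height),
            0, 0, img.Width, img.Height, GraphicsUnit.Pixel, wrapMode);
    }
    return destImage;
}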
Code to load the model and to read the xml file that contains the pixel means:
public static IEvaluateModelManagedF loadModel(string modelPath, string outputLayerName)
{
    var networkConfiguration = String.Format("modelPath=\"{0}\" outputNodeNames=\"{1}\"", modelPath, outputLayerName);
    Stopwatch stopWatch = Stopwatch.StartNew();
    var model = new IEvaluateModelManagedF();
    model.CreateNetwork(networkConfiguration, deviceId: -1);
    stopWatch.Stop();
    Console.WriteLine("Time to create network: {0} ms.", stopWatch.ElapsedMilliseconds);
    return model;
}

/// Read the xml mean file, i.e. the offsets which are subtracted
/// from each pixel in an image before using it as input to a CNTK model.
public static float[] readXmlMeanFile(string XmlPath, int ImgWidth, int ImgHeight)
{
    // Read and parse the pixel value xml file
    XmlTextReader reader = new XmlTextReader(XmlPath);
    reader.ReadToFollowing("data");
    reader.Read();
    var pixelMeansXml =
        reader.Value.Split(new[] { "\r", "\n", " " }, StringSplitOptions.RemoveEmptyEntries)
        .Select(Single.Parse)
        .ToArray();

    // Re-order the mean pixel values to be in the same order as the bitmap
    // image (as output by the ImGetRGBChannels() function).
    int inputDim = 3 * ImgWidth * ImgHeight;
    Debug.Assert(pixelMeansXml.Length == inputDim);
    var pixelMeans = new float[inputDim];
    int counter = 0;
    for (int c = 0; c < 3; c++)
        for (int h = 0; h < ImgHeight; h++)
            for (int w = 0; w < ImgWidth; w++)
            {
                int xmlIndex = h * ImgWidth * 3 + w * 3 + c;
                pixelMeans[counter++] = pixelMeansXml[xmlIndex];
            }
    return pixelMeans;
}
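For reference, readXmlMeanFile only expects whitespace-separated floats inside the first <data> element, stored in height x width x channel order. An illustrative mean file in the OpenCV-storage style used by the CNTK image reader tools might look roughly like this; element names, sizes, and values here are placeholders:

<?xml version="1.0" ?>
<opencv_storage>
  <MeanImg type_id="opencv-matrix">
    <rows>1</rows>
    <cols>150528</cols>
    <dt>f</dt>
    <data>
      104.0 117.0 123.0 104.0 117.0 123.0 ...
    </data>
  </MeanImg>
</opencv_storage>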
Code to load an image and convert it to the model input:
/// Creates a list of CNTK feature values from a given bitmap.
/// The image is first resized to fit into a (targetSize x targetSize) bounding box,
/// then the image planes are converted to a CNTK tensor, and the mean
/// pixel values are subtracted. Returns a list with targetSize * targetSize * 3 floats.
private static List<float> ImageToFeatures(Bitmap img, int targetSize, float[] pixelMeans)
{
    // Apply the same image pre-processing that is typically done in CNTK:
    // take a center crop of the image, then re-size it to the network input size.
    var imgCropped = ImCropToCenter(img, 1.0);
    var imgResized = ImResize(imgCropped, targetSize, targetSize);

    // Convert pixels to CNTK model input.
    // Fast pixel extraction is ~5x faster while giving identical output.
    var features = new float[3 * imgResized.Height * imgResized.Width];
    var boFastPixelExtraction = true;
    if (boFastPixelExtraction)
    {
        var pixelsRGB = ImGetRGBChannels(imgResized);
        for (int c = 0; c < 3; c++)
        {
            byte[] pixels = pixelsRGB[2 - c];
            Debug.Assert(pixels.Length == imgResized.Height * imgResized.Width);
            for (int i = 0; i < pixels.Length; i++)
            {
                int featIndex = i + c * pixels.Length;
                features[featIndex] = pixels[i] - pixelMeans[featIndex];
            }
        }
    }
    else
    {
        // Traverse the image in the format that is used in OpenCV:
        // first the B plane, then the G plane, then the R plane.
        // Note: calling GetPixel(w, h) repeatedly is slow!
        int featIndex = 0;
        for (int c = 0; c < 3; c++)
            for (int h = 0; h < imgResized.Height; h++)
                for (int w = 0; w < imgResized.Width; w++)
                {
                    var pixel = imgResized.GetPixel(w, h);
                    float v;
                    if (c == 0)
                        v = pixel.B;
                    else if (c == 1)
                        v = pixel.G;
                    else if (c == 2)
                        v = pixel.R;
                    else
                        throw new Exception("Unexpected channel index.");

                    // Subtract the pixel mean
                    features[featIndex] = v - pixelMeans[featIndex];
                    featIndex++;
                }
    }
    return features.ToList();
}
/// Convert a bitmap image to R,G,B channel byte arrays.
/// See: http://stackoverflow.com/questions/6020406/travel-through-pixels-in-bmp
private static List<byte[]> ImGetRGBChannels(Bitmap bmp)
{
    // Lock the bitmap's bits.
    Rectangle rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
    BitmapData bmpData = bmp.LockBits(rect, ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);

    // Declare an array to hold the bytes of the bitmap.
    int bytes = bmpData.Stride * bmp.Height;
    byte[] rgbValues = new byte[bytes];
    byte[] r = new byte[bytes / 3];
    byte[] g = new byte[bytes / 3];
    byte[] b = new byte[bytes / 3];

    // Copy the RGB values into the array, starting from ptr to the first line.
    IntPtr ptr = bmpData.Scan0;
    Marshal.Copy(ptr, rgbValues, 0, bytes);

    // Populate the per-channel byte arrays.
    int count = 0;
    int stride = bmpData.Stride;
    for (int col = 0; col < bmpData.Height; col++)
    {
        for (int row = 0; row < bmpData.Width; row++)
        {
            int offset = (col * stride) + (row * 3);
            b[count] = rgbValues[offset];
            g[count] = rgbValues[offset + 1];
            r[count++] = rgbValues[offset + 2];
        }
    }
    bmp.UnlockBits(bmpData);
    return new List<byte[]> { r, g, b };
}
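Putting the pieces together, here is a minimal sketch of how the helpers above could be wired up to evaluate a single image. The file paths, the output node name OutputNodes.z, and the class count of 2 are assumptions that need to match your own model; the Evaluate call follows the same pattern as in the F# answer above:

// Hypothetical usage; adjust paths, node name and output size to your model.
var model = loadModel(@"myModelFile", "OutputNodes.z");
var pixelMeans = readXmlMeanFile(@"myMeanFile.xml", 224, 224);
using (var img = new Bitmap(@"myTestImage.jpg"))
{
    var features = ImageToFeatures(img, 224, pixelMeans);
    var inputs = new Dictionary<string, List<float>> { { "features", features } };

    // Returns one float per output value, e.g. two values for binary classification.
    var outputs = model.Evaluate(inputs, "OutputNodes.z", 2);
    Console.WriteLine(string.Join(", ", outputs));
}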