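Why must I reshape an image to [n, height, width, channels] in a CNN?

Time: 2018-01-29 20:25:56

Tags: python tensorflow conv-neural-network reshape tensor

I am trying to apply a convolutional layer to an image of shape [256,256,3]. When I use the image tensor directly, I get an error: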

conv1 = conv2d(input,W_conv1) +b_conv1  #<=== error 

Error message:

ValueError: Shape must be rank 4 but is rank 3 for 'Conv2D' (op: 'Conv2D') 
with input shapes: [256,256,3], [3,3,3,1].    

When I reshape the input first, conv2d works correctly:

x_image = tf.reshape(input,[-1,256,256,3])
conv1 = conv2d(x_image,W_conv1) +b_conv1

If I have to reshape the tensor, what is the best shape to reshape it to in my case?

import tensorflow as tf
import numpy as np
from PIL import Image

def img_to_tensor(img) :
    return tf.convert_to_tensor(img, np.float32)

def weight_generater(shape):
    return tf.Variable(tf.truncated_normal(shape,stddev=0.1))

def bias_generater(shape):
    return tf.Variable(tf.constant(.1,shape=shape))

def conv2d(x,W):
    return tf.nn.conv2d(x,W,[1,1,1,1],'SAME')

def pool_max_2x2(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,1,1,1],padding='SAME')

#read image
img = Image.open("img.tif")

sess = tf.InteractiveSession()

# convert image to tensor
input = img_to_tensor(img).eval()
#print(input)

# get img dimension
img_dimension = tf.shape(input).eval()
print(img_dimension)

height,width,channel=img_dimension
filter_size = 3
feature_map = 32

x = tf.placeholder(tf.float32,shape=[height*width*channel])
y = tf.placeholder(tf.float32,shape=21)

# generate weights [kernel size, kernel size, channels, number of filters]
W_conv1 = weight_generater([filter_size,filter_size,channel,1])

# each filter in W has its own bias
b_conv1 = bias_generater([feature_map])

""" I must reshape the picture
x_image = tf.reshape(input,[-1,256,256,3])
"""
conv1 = conv2d(input,W_conv1) +b_conv1  #<=== error

h_conv1 = tf.nn.relu(conv1)

h_pool1 = pool_max_2x2(h_conv1)

layer1_dimension = tf.shape(h_pool1).eval()

print(layer1_dimension)

1 answer:

Answer 0 (score: 6)

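The first dimension is the batch size. If you are feeding one image at a time, you can simply make the first dimension 1. This does not change any of the data; it only changes the indexing to 4D:

x_image = tf.reshape(input, [1, 256, 256, 3])

If you reshape it with -1 in the first dimension instead, you are saying that you will feed in a 4D batch of images (shaped [n, height, width, channels]) and that the batch size is allowed to be dynamic (which is common practice).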
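To make the difference between a batch size of 1 and -1 concrete, here is a minimal sketch. It uses NumPy rather than TensorFlow so that it runs without a session, on the assumption that np.reshape and np.expand_dims treat -1 and the new axis the same way tf.reshape and tf.expand_dims do. The image array is made-up stand-in data, not the img.tif from the question.

```python
import numpy as np

# Hypothetical stand-in for the decoded 256x256 RGB image from the question.
image = np.arange(256 * 256 * 3, dtype=np.float32).reshape(256, 256, 3)

# Fixed batch of one: the values are untouched, only a leading axis is added.
batch_of_one = image.reshape(1, 256, 256, 3)

# -1 lets reshape infer the batch size from the total number of elements.
inferred = image.reshape(-1, 256, 256, 3)

# Same result without spelling out height, width and channels.
expanded = np.expand_dims(image, axis=0)

# With two stacked images, the same -1 reshape resolves to a batch of two.
two_images = np.stack([image, image]).reshape(-1, 256, 256, 3)

print(batch_of_one.shape, inferred.shape, expanded.shape, two_images.shape)
```

For a single image all three forms give the same 4D array, so the choice is mostly about intent: 1 pins the batch size, -1 keeps the same line working if you later feed several images at once.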