I am working on an image-related regression problem. Generally, I have to predict the heights of a grid from information and features extracted from images. I tried to use convolutional layers as a feature extractor, extracting features for parts of the image similarly to https://arxiv.org/abs/1504.06066. However, even though I have only 1032 trainable parameters, which is quite low for a dataset of 4k images, my network does not get anywhere and always overfits.
Here is the code for my network, which is an adaptation of DenseNet: https://github.com/taki0112/Densenet-Tensorflow
def buildNetwork(self):
    x = self.placeholders['imageInput']
    # x = tf.expand_dims(x, 0)
    stride = 1
    with tf.variable_scope("ImageNetwork_"):
        x1 = self.conv_layer(x, 7, 2 * self.growthRate, 2, name="conv1")
        x = self.denseBlock(x1, name="dense1", size=1)
        x2 = self.transitionLayer(x, name="transition1")
        x = self.denseBlock(x2, name="dense2", size=1)
        x3 = self.transitionLayer(x, name="transition2")
        vertexFeatures = self.extractFeatures(
            features=[tf.squeeze(x1), tf.squeeze(x2), tf.squeeze(x3)])
        self.features = vertexFeatures
        mlp5 = self.fc_layer(vertexFeatures, 4, "_MLP5", True)
        mlp6 = self.fc_layer(mlp5, 1, "_MLP6", False)
        # tf.layers.dropout(mlp6, rate=Config.DROPOUT_RATIO, training=Config.IS_TRAINING)
        mlp6 = tf.squeeze(mlp6)
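For context, the placeholders feeding buildNetwork are set up roughly like this (a simplified sketch; heightTarget is an illustrative name, only imageInput appears in the code above):

import tensorflow as tf

# 128x128 single-channel input images; one height value per sample.
placeholders = {
    'imageInput': tf.placeholder(tf.float32, shape=[None, 128, 128, 1], name="imageInput"),
    'heightTarget': tf.placeholder(tf.float32, shape=[None], name="heightTarget"),
}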
Here is my feature extractor, which extracts features at the coordinates given by the x1, x2, and x3 layers:
currentFeatures = features[c]
# Determine the coordinate according to the current feature map size
shape = currentFeatures.get_shape().as_list()
batch = shape[0]
coordstf33 = [shape[1] / 2, shape[1] / 2]
# ---------------------------------------------------------
# coordShape is the number of coordinates per image (defined elsewhere)
coords33floorStacked = np.zeros((batch, coordShape, 3), dtype=np.float32)
coords33floor = np.floor(coordstf33)
for j in range(batch):
    for i in range(coordShape):
        coords33floorStacked[j][i] = [j, coords33floor[0], coords33floor[1]]
coordstf33floortf = tf.convert_to_tensor(value=coords33floorStacked, dtype=tf.float32)
coordstf33floortf = tf.cast(coordstf33floortf, dtype=tf.int32)
# Extract the feature vectors at those coordinates
v133 = tf.gather_nd(currentFeatures, [coordstf33floortf], name="_gather_{}33".format(i + 1))
finalTemp = tf.squeeze([v133])
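To make it clearer what the gather is supposed to do, here is a minimal, self-contained example of picking one (batch, row, col) feature vector per sample with tf.gather_nd (toy shapes, not my real ones):

import numpy as np
import tensorflow as tf

# Batch of 2 feature maps, 4x4 spatial, 3 channels.
feats = tf.constant(np.arange(2 * 4 * 4 * 3, dtype=np.float32).reshape(2, 4, 4, 3))

# One (batch_index, row, col) triple per sample: the centre pixel (2, 2).
indices = tf.constant([[0, 2, 2],
                       [1, 2, 2]], dtype=tf.int32)

# Gathers the channel vector at each location, result shape (2, 3).
center = tf.gather_nd(feats, indices)

with tf.Session() as sess:
    print(sess.run(center).shape)  # (2, 3)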
Here is another part of the network, which can also be found in the GitHub link above:
def bottleneckLayer(self, input, name):
    with tf.variable_scope(name):
        c = self.Batch_Normalization(input, training=Config.IS_TRAINING, name=name + "_Batch1")
        c = tf.nn.relu(c)
        c = self.conv_layer(c, 1, 4 * self.growthRate, 1, "_conv1")
        c = tf.layers.dropout(c, rate=Config.DROPOUT_RATIO, training=Config.IS_TRAINING)
        c = self.Batch_Normalization(c, training=Config.IS_TRAINING, name=name + "_Batch2")
        c = tf.nn.relu(c)
        c = self.conv_layer(c, 3, self.growthRate, 1, "_conv2")
        c = tf.layers.dropout(c, rate=Config.DROPOUT_RATIO, training=Config.IS_TRAINING)
        return c

def transitionLayer(self, input, name):
    with tf.variable_scope(name):
        c = self.Batch_Normalization(input, training=Config.IS_TRAINING, name=name + "_Batch1")
        c = tf.nn.relu(c)
        inChannel = c.shape[-1]
        features = inChannel.value
        if features % 2 != 0:
            features -= 1
        # Halve the channel count (features is even here, so this stays an int)
        c = self.conv_layer(c, 1, features // 2, 1, name="_conv1")
        c = tf.layers.dropout(c, rate=Config.DROPOUT_RATIO, training=Config.IS_TRAINING)
        c = self.max_pool(c, 2, 2, name="_avgPool")
        return c

def denseBlock(self, input, name, size):
    with tf.variable_scope(name):
        x = self.bottleneckLayer(input, "_bottleneck" + str(0))
        x = tf.concat([input, x], axis=3)
        for i in range(1, size):
            k = self.bottleneckLayer(x, "_bottleneck" + str(i))
            x = tf.concat([x, k], axis=3)
        return x
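The remaining helpers (conv_layer, max_pool, Batch_Normalization) come from the linked repo; they are roughly the following thin wrappers around tf.layers (simplified from memory, see the repo for the exact versions):

def conv_layer(self, input, kernel, filters, stride, name):
    # Plain 2D convolution, no bias since batch norm follows each conv.
    return tf.layers.conv2d(inputs=input, filters=filters, kernel_size=kernel,
                            strides=stride, padding='SAME', use_bias=False, name=name)

def max_pool(self, input, pool_size, stride, name):
    return tf.layers.max_pooling2d(inputs=input, pool_size=pool_size,
                                   strides=stride, padding='SAME', name=name)

def Batch_Normalization(self, input, training, name):
    return tf.layers.batch_normalization(inputs=input, training=training, name=name)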
Right now I am just trying to predict the height at pixel (64, 64) of images of size 128x128, with heights ranging from 0 to 500, which I normalize to [0, 1]. My validation loss gets stuck around 2.71 and my accuracy stays below 20% no matter how many weights my network has, so I suspect the problem lies somewhere else.

I should mention that the number of features I extract from that part of the image is only 9; these are then passed to the fully connected layers before the prediction. Also, I use MSE as the loss function, set up roughly as in the sketch below.
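Here is a simplified sketch of the loss and training step (the Adam optimizer and the heightTarget placeholder are illustrative, not necessarily verbatim from my code):

predictions = mlp6                                  # output of buildNetwork
targets = placeholders['heightTarget']              # heights normalized to [0, 1]
loss = tf.reduce_mean(tf.squared_difference(targets, predictions))  # MSE

# Batch norm statistics need their update ops run alongside the train step in TF1.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

Thanks in advance!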