Neural network error does not converge

Date: 2018-04-16 08:30:20

Tags: tensorflow neural-network julia

I am creating a neural network with TensorFlow in Julia.

My network runs, but the error does not converge. Here is the TensorBoard result:

[TensorBoard screenshot: the error curve does not converge]

To check my error function, I used Malmaud's tutorial and replaced the "accuracy" metric with my own function.

It works:

[TensorBoard screenshot: with the tutorial's network, the error converges]

Given that, I think there is a problem with my network.

Can you help me?

Here is my code:


EDIT

The following code is working:

ENV["CUDA_VISIBLE_DEVICES"] = "0" # It is to use the gpu
using TensorFlow
using Distributions

sess = Session(Graph())

batch_size = 30  
num_pixels = 64

###########

# Data set: 1000 arrays; the i-th array is filled with the value i (the first with 1, the second with 2, etc.)

arrays_data = zeros(Float32,1000,num_pixels,num_pixels)

arrays_labels = zeros(Float32,1000)

for k in 1:num_pixels, j in 1:num_pixels, i in 1:1000
        arrays_data[i,j,k] = i
end

for i in 1:1000
    arrays_labels[i] = i
end

###########

# Inputs: x is fed the scalar label reshaped to [batch_size, 1, 1, 1],
# y is fed the flattened 64x64 target image (see the train loop below)

x = placeholder(Float32, shape= [batch_size, 1, 1, 1])

y = placeholder(Float32)

###########

# Function to create a batch

function create_batch(batch_size)
    x = zeros(Float32, batch_size, num_pixels, num_pixels)
    y = zeros(Float32, batch_size)

    index = shuffle(1:1000) # To pick a random batch

    for i in 1:batch_size
        x[i, :, :] = arrays_data[index[i], :, :]
        y[i] = arrays_labels[index[i]]
    end
    y, x # Returned as (labels, images): the label is the network input, the image is the target
end


###########


# Summary module, used for TensorBoard

summary = TensorFlow.summary

# Create the different layers; "poids" is French for "weight"

variable_scope("mymodel" * randstring(), initializer=Normal(0, .001)) do
    global poids_1 = get_variable("p1", [2,2,2,1], Float32)
    global poids_2 = get_variable("p2",[4,4,3,2],Float32)
    global poids_3 = get_variable("p3",[2,2,4,3],Float32)
    global poids_4 = get_variable("p4",[1,4,4,4],Float32)
    global poids_5 = get_variable("p5",[1,4,4,4],Float32)
    global poids_6 = get_variable("p6",[1,4,4,4],Float32)
    global biases_1 = get_variable("b1",[2],Float32)
    global biases_2 = get_variable("b2",[3],Float32)
    global biases_3 = get_variable("b3",[4],Float32)
    global biases_4 = get_variable("b4",[4],Float32)
    global biases_5 = get_variable("b5",[4],Float32)
    global biases_6 = get_variable("b6",[4],Float32)
end
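
# conv2d_transpose filter shapes are [height, width, output_channels, input_channels]:
# poids_1 = [2,2,2,1] is a 2x2 kernel mapping 1 input channel to 2 output channels, etc.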

logits_1 = nn.relu(nn.conv2d_transpose(x, poids_1, [batch_size,2,2,2], [1,2,2,1],padding = "SAME") + biases_1)

logits_2 = nn.relu(nn.conv2d_transpose(logits_1,poids_2, [batch_size,4,4,3], [1,2,2,1],padding = "SAME") + biases_2)

logits_3 = nn.relu(nn.conv2d_transpose(logits_2,poids_3, [batch_size,8,8,4], [1,2,2,1],padding = "SAME") + biases_3)

logits_4 = nn.relu(nn.conv2d_transpose(logits_3,poids_4, [batch_size,16,16,4], [1,2,2,1],padding = "SAME") + biases_4)

logits_5 = nn.relu(nn.conv2d_transpose(logits_4,poids_5, [batch_size,32,32,4], [1,2,2,1],padding = "SAME") + biases_5)

logits_6 = nn.relu(nn.conv2d_transpose(logits_5,poids_6, [batch_size,64,64,4], [1,2,2,1],padding = "SAME") + biases_6)

logits_6 = reduce_sum(logits_6, axis=[4]) # Collapse the 4 channels into a single 64x64 map

logits = reshape(logits_6, [batch_size,num_pixels*num_pixels]) # Output of the network

smax = nn.softmax(logits)

cross_entropy = reduce_mean(-reduce_sum(y .* log(smax))) # Cross-entropy loss function

optimizer = train.AdamOptimizer(0.0001)

train_op = train.minimize(optimizer,cross_entropy)

error = (1/(num_pixels*num_pixels*batch_size)).*sqrt(sum((smax - y)^2)) # Error metric tracked in TensorBoard

summary.histogram("Error", error)

merged = summary.merge_all()

run(sess, global_variables_initializer())

# summary_writer = summary.FileWriter("Folder Path") # If you want to use TensorBoard

# Train loop

for i in 1:500
    batch = create_batch(batch_size)

    # batch[1] holds the labels (fed as the network input), batch[2] the images (the target)
    run(sess, train_op, Dict(x => reshape(batch[1], (batch_size,1,1,1)), y => reshape(batch[2], (batch_size,64*64))))

    if i%100 == 1
        err = run(sess, error, Dict(x => reshape(batch[1], (batch_size,1,1,1)), y => reshape(batch[2], (batch_size,64*64))))
        info("train $i , error = $err")
    end

    # If you use TensorBoard, uncomment the following commands:
    # new = run(sess, merged, Dict(x => reshape(batch[1], (batch_size,1,1,1)), y => reshape(batch[2], (batch_size,64*64))))
    # write(summary_writer, new, i)
end

close(sess)

2 Answers:

Answer 0 (score: 0)

Do you need to define error as a function?

Like: error(smax, y) = (1/(num_pixels*num_pixels*batch_size)).*sqrt(sum((smax - y)^2))
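
A minimal sketch of how that suggestion could look with the names from the question (the error_op name is hypothetical; as a function, the graph node is only built when it is called):

# Hypothetical wrapper: build the metric node by calling the function once
error_op(smax, y) = (1/(num_pixels*num_pixels*batch_size)) .* sqrt(sum((smax - y)^2))
err_node = error_op(smax, y)
# err = run(sess, err_node, ...) # then evaluate err_node inside the train loop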

Answer 1 (score: 0)

I finally found the solution to this problem.

Three key points:

  • Malmaud's tutorial applies a softmax to the output of the network because there are several possible classes and the best one (the one with the highest probability) has to be chosen. In this case the output is a picture, so no softmax should be applied; the output just has to be compared to the input.

  • For the loss function, cross-entropy is not needed; choose least squares (a minimal sketch follows this list).

  • With only 64 pixels there is not enough data, so a database of 256-pixel images is better.
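
As a minimal sketch, the first two points applied to the code above would look like this, using the logits and y tensors already defined in the question (the exact form of the reduction is an assumption, not a copy of my final code):

# Sketch: drop the softmax and compare the raw output directly to the target image
loss = reduce_mean((logits - y)^2) # least-squares (MSE) loss instead of cross-entropy
optimizer = train.AdamOptimizer(0.0001)
train_op = train.minimize(optimizer, loss)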

I added the new code to my question.