如何使用`sparse_softmax_cross_entropy_with_logits`:没有获得不兼容的形状错误

时间:2017-01-30 13:22:02

标签: tensorflow julia

我想使用sparse_softmax_cross_entropy_with_logits 与朱莉娅TensorFlow包装。

操作在code here中定义。 基本上,正如我所理解的那样,第一个参数应该是logits,通常会被提供给softmax以使它们成为类别概率(~1hot输出)。 第二个应该是正确的标签作为标签ID。

我调整了TensorFlow.jl readme的示例代码 见下文:

using Distributions
using TensorFlow

# Generate some synthetic data
x = randn(100, 50)
w = randn(50, 10)
y_prob = exp(x*w)
y_prob ./= sum(y_prob,2)

function draw(probs)
    y = zeros(size(probs))
    for i in 1:size(probs, 1)
        idx = rand(Categorical(probs[i, :]))
        y[i, idx] = 1
    end
    return y
end

y = draw(y_prob)

# Build the model
sess = Session(Graph())
X = placeholder(Float64)
Y_obs = placeholder(Float64)
Y_obs_lbl = indmax(Y_obs, 2)
variable_scope("logisitic_model", initializer=Normal(0, .001)) do
    global W = get_variable("weights", [50, 10], Float64)
    global B = get_variable("bias", [10], Float64)
end

L = X*W + B
Y=nn.softmax(L)
#costs = log(Y).*Y_obs #Dense (Orginal) way
costs = nn.sparse_softmax_cross_entropy_with_logits(L, Y_obs_lbl+1) #sparse way
Loss = -reduce_sum(costs)
optimizer = train.AdamOptimizer()
minimize_op = train.minimize(optimizer, Loss)
saver = train.Saver()
# Run training
run(sess, initialize_all_variables())
cur_loss, _ = run(sess, [Loss, minimize_op], Dict(X=>x, Y_obs=>y))

但是当我运行它时,我收到一个错误:

Tensorflow error: Status: Incompatible shapes: [1,100] vs. [100,10]
     [[Node: gradients/SparseSoftmaxCrossEntropyWithLogits_10_grad/mul = Mul[T=DT_DOUBLE, _class=[], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/SparseSoftmaxCrossEntropyWithLogits_10_grad/ExpandDims, SparseSoftmaxCrossEntropyWithLogits_10:1)]]


 in check_status(::TensorFlow.Status) at /home/ubuntu/.julia/v0.5/TensorFlow/src/core.jl:101
 in run(::TensorFlow.Session, ::Array{TensorFlow.Port,1}, ::Array{Any,1}, ::Array{TensorFlow.Port,1}, ::Array{Ptr{Void},1}) at /home/ubuntu/.julia/v0.5/TensorFlow/src/run.jl:96
 in run(::TensorFlow.Session, ::Array{TensorFlow.Tensor,1}, ::Dict{TensorFlow.Tensor,Array{Float64,2}}) at /home/ubuntu/.julia/v0.5/TensorFlow/src/run.jl:143

这只有在我尝试训练它时才会发生。 如果我不包括优化函数/输出,那么它工作正常。 所以我正在做一些搞砸了渐变数学的东西。

0 个答案:

没有答案