Input to reshape is a tensor with 2 * "batch_size" values, but the requested shape has "batch_size"

Date: 2017-02-08 14:18:56

Tags: python tensorflow keras stateful

I want to build an RNN using a Keras Sequential model with the TensorFlow backend. When I run the following code:

batch_size = 8
batch_inputshape = (batch_size, x_train.shape[1], x_train.shape[2])
print(batch_inputshape)  # (8, 600, 103)

model = Sequential()
model.add(LSTM(103,
               batch_input_shape=batch_inputshape,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))

model.add(LSTM(50,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))

model.add(TimeDistributed(Dense(10)))
model.add(TimeDistributed(Dense(2)))
model.add(Activation('softmax'))
model.compile(loss=ncce, optimizer='adam')

print(model.output_shape)  # (8, 600, 2)

model.fit(x_train, y_train, batch_size=batch_size,
          nb_epoch=1, validation_split=0.25)

I get the following error message:

Input to reshape is a tensor with 16 values, but the requested shape has 8

But no matter what I change batch_size to, the error simply follows this formula:

Input to reshape is a tensor with 2 * batch_size values, but the requested shape has batch_size

I have looked at other Q&As, but I don't think they help me much, or else I don't understand the answers well enough.

Any help is greatly appreciated!

EDIT: As requested, the shapes of the input and target:

print(x_train.shape) #(512,600,103)
print(y_train.shape) #(512,600,2)

EDIT 2:

from functools import partial
from itertools import product

import numpy as np
import keras.backend as K

def w_categorical_crossentropy(y_true, y_pred, weights):
    # https://github.com/fchollet/keras/issues/2115#issuecomment-274101310 #
    nb_cl = len(weights)
    final_mask = K.zeros_like(y_pred[:, 0])
    y_pred_max = K.max(y_pred, axis=1)
    y_pred_max = K.reshape(y_pred_max, (K.shape(y_pred)[0], 1))
    y_pred_max_mat = K.cast(K.equal(y_pred, y_pred_max), K.floatx())
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        final_mask += (weights[c_t, c_p] * y_pred_max_mat[:, c_p] * y_true[:, c_t])
    return K.categorical_crossentropy(y_pred, y_true) * final_mask

w_array = np.ones((2, 2))
w_array[1, 0] = 100

print(w_array)
ncce = partial(w_categorical_crossentropy, weights=w_array)
ncce.__name__ = 'w_categorical_crossentropy'

EDIT 3: UPDATE

With the help of @Nassim Ben, the problem was traced to the loss function. He posted the code with a regular loss function and it worked just fine; with the custom loss function the code does not work. As any reader of this question can see, I have posted my custom loss function above, and the problem is in there. At the moment I do not know why this error occurs, but this is the current state.
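For what it is worth, the 2 * batch_size pattern can be reproduced with plain NumPy on dummy data of the shapes above (a rough sketch of the reshape step inside the loss, not the actual Keras graph):

import numpy as np

batch_size, timesteps, n_classes = 8, 600, 2
y_pred = np.random.random((batch_size, timesteps, n_classes))

# K.max(y_pred, axis=1) in the loss above collapses the time axis and keeps the
# class axis, so y_pred_max carries 2 * batch_size = 16 values ...
y_pred_max = np.max(y_pred, axis=1)
print(y_pred_max.shape)  # (8, 2)

# ... while the following K.reshape asks for (batch_size, 1) = 8 values, which is
# the "16 values, but the requested shape has 8" pattern from the error message.
try:
    y_pred_max.reshape((y_pred.shape[0], 1))
except ValueError as err:
    print(err)  # cannot reshape array of size 16 into shape (8,1)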

1 Answer:

Answer 0 (score: 0)

EDIT: This code works for me; I only changed the loss for simplicity.

import keras
from keras.layers import *
from keras.models import Sequential
from keras.objectives import *
import numpy as np

x_train = np.random.random((512, 600, 103))
y_train = np.random.random((512, 600, 2))
batch_size = 8
batch_inputshape = (batch_size, x_train.shape[1], x_train.shape[2])
print(batch_inputshape)  # (8, 600, 103)

model = Sequential()
model.add(LSTM(103,
               batch_input_shape=batch_inputshape,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))
model.add(LSTM(50,
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))

model.add(TimeDistributed(Dense(10)))
model.add(TimeDistributed(Dense(2)))
model.add(Activation('softmax'))
model.compile(loss="mse", optimizer='adam')

print(model.output_shape)  # (8, 600, 2)

model.fit(x_train, y_train, batch_size=batch_size,
          nb_epoch=1, validation_split=0.25)

EDIT 2:

So the error comes from the loss function. In the code you copied from GitHub for the ncce loss, the outputs have shape (batch, 10), whereas your outputs have shape (batch, 600, 2). So here is my edit of the function:

import numpy as np
import keras.backend as K
from itertools import product

def w_categorical_crossentropy(y_true, y_pred, weights):
    # https://github.com/fchollet/keras/issues/2115#issuecomment-274101310 #
    nb_cl = len(weights)
    # Create a mask of zeroes with shape (batch, 600)
    final_mask = K.zeros_like(y_pred[:, :, 0])
    # Get the maximum probability value for every output (shape = (batch, 600, 1))
    y_pred_max = K.max(y_pred, axis=2, keepdims=True)
    # Get the actual predictions for every output (shape = (batch, 600, 2))
    # This K.equal uses broadcasting: we compare two tensors of different sizes, but it works
    y_pred_max_mat = K.equal(y_pred, y_pred_max)
    for c_p, c_t in product(range(nb_cl), range(nb_cl)):
        # Build the mask of weights to apply to the result of the categorical crossentropy
        final_mask += (weights[c_t, c_p] * K.cast(y_pred_max_mat[:, :, c_p], K.floatx()) * y_true[:, :, c_t])
    return K.categorical_crossentropy(y_pred, y_true) * final_mask

w_array = np.ones((2, 2))
w_array[1, 0] = 100

As you can see, I only modified the indexing to work with your particular shapes. The mask has to have shape (batch, 600). The max has to be taken over the third dimension, since that is where the probabilities you want to output are. The matrix multiplication also had to be updated because of the shapes of the tensors.

This should work.
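As a quick sanity check of those shapes, here is a NumPy-only sketch of the same mask computation on random data of the shapes used above:

import numpy as np
from itertools import product

batch_size, timesteps, nb_cl = 8, 600, 2
y_true = np.random.random((batch_size, timesteps, nb_cl))
y_pred = np.random.random((batch_size, timesteps, nb_cl))
weights = np.ones((nb_cl, nb_cl))
weights[1, 0] = 100

final_mask = np.zeros_like(y_pred[:, :, 0])          # (8, 600): one weight per timestep
y_pred_max = np.max(y_pred, axis=2, keepdims=True)   # (8, 600, 1): max over the class axis
y_pred_max_mat = (y_pred == y_pred_max)              # (8, 600, 2): broadcast comparison

for c_p, c_t in product(range(nb_cl), range(nb_cl)):
    final_mask += weights[c_t, c_p] * y_pred_max_mat[:, :, c_p] * y_true[:, :, c_t]

print(final_mask.shape)  # (8, 600) -- same shape as the per-timestep crossentropy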

If you need a more detailed explanation, feel free to ask :-)