I am trying to train a model that takes two inputs, concatenates them, and feeds the result into an LSTM. The final layer is a Dense() call, and the targets are binary vectors (with more than one 1). The task is classification.
My input sequences are 50 rows of 23 timesteps with 5625 features (x_train), and my supplementary input (not really a sequence) is 50 one-hot rows of length 23 (total_hours).
The error I get is:
ValueError: Error when checking target: expected dense_1 to have shape (1, 5625) but got array with shape (5625, 1)
My code is:
import numpy as np
from keras.layers import LSTM, Dense, Input, Concatenate
from keras.models import Model
#CREATING DUMMY INPUT
hours_input_1 = np.eye(23)
hours_input_2 = np.eye(23)
hours_input_3 = np.pad(np.eye(4), pad_width=((0, 19), (0, 19)), mode='constant')
hours_input_3 = hours_input_3[:4,]
total_hours = np.vstack((hours_input_1, hours_input_2, hours_input_3))
seq_input = np.random.normal(size=(50, 24, 5625))
y_train = np.array([seq_input[i, -1, :] for i in range(50)])
x_train = np.array([seq_input[i, :-1, :] for i in range(50)])
#print 'total_hours', total_hours.shape #(50, 23)
#print 'x_train', x_train.shape #(50, 23, 5625)
#print 'y_train shape', y_train.shape #(50, 5625)
#MODEL DEFINITION
seq_model_in = Input(shape=(1,), batch_shape=(1, 1, 5625))
hours_model_in = Input(shape=(1,), batch_shape=(1, 1, 1))
merged = Concatenate(axis=-1)([seq_model_in, hours_model_in])
#print merged.shape #(1, 1, 5626) = added the 'hour' on as an extra feature
merged_lstm = LSTM(10, batch_input_shape=(1, 1, 5625), return_sequences=False, stateful=True)(merged)
merged_dense = Dense(5625, activation='sigmoid')(merged_lstm)
model = Model(inputs=[seq_model_in, hours_model_in], outputs=merged_dense)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
#TRAINING
for epoch in range(10):
    for i in range(50):
        y_true = y_train[i,:]
        for j in range(23):
            input_1 = np.expand_dims(np.expand_dims(x_train[i][j], axis=1), axis=1)
            input_1 = np.reshape(input_1, (1, 1, x_train.shape[2]))
            input_2 = np.expand_dims(np.expand_dims(np.array([total_hours[i][j]]), axis=1), axis=1)
            tr_loss, tr_acc = model.train_on_batch([input_1, input_2], y_true)  #np.array([y_true]))
        model.reset_states()
My model.summary() looks like this: (output not included)
I am using Keras 2.1.2 with the TensorFlow backend (TensorFlow 1.4.0). How can I resolve the ValueError?
Answer 0 (score: 0)
It turns out I needed to fix the target, just as the ValueError suggested.
If you replace:
y_true = y_train[i,:]
with:
y_true_1 = np.expand_dims(y_train[i,:], axis=1)
y_true = np.swapaxes(y_true_1, 0, 1)
the code runs.
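To see why this helps, it can be useful to look at the shapes alone, without Keras. The sketch below uses random dummy targets with the same (50, 5625) layout as y_train in the question. It assumes the problem is simply that y_train[i,:] is a 1-D array, which Keras seems to interpret as 5625 samples of one value rather than one sample of 5625 values. The names y_slice and y_reshaped are illustrative and do not appear in the original code.

```python
import numpy as np

# Dummy targets with the same layout as y_train in the question: (50, 5625).
y_train = np.random.normal(size=(50, 5625))

i = 0
row = y_train[i, :]  # 1-D array, shape (5625,): no batch axis

# The fix from the answer: add a trailing axis, then swap it to the front.
y_true = np.swapaxes(np.expand_dims(row, axis=1), 0, 1)

# Two shorter spellings that should produce the same (1, 5625) array
# (illustrative alternatives, not part of the original answer).
y_slice = y_train[i:i + 1, :]    # slicing with a range keeps the batch axis
y_reshaped = row.reshape(1, -1)  # explicit reshape to one row

print(row.shape, y_true.shape, y_slice.shape, y_reshaped.shape)
```

Any of the three forms gives a target with a leading batch axis of size 1, which matches the batch size of 1 fixed by batch_shape in the model definition. The commented-out np.array([y_true]) in the training loop would have the same effect.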