I'm somewhat new to Keras and deep learning. I'm currently trying to replicate this paper, but when I compile the second model (the one using LSTMs) I get the following error:
"TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'"
The model is described as follows:

Input of length T (T is the device-specific window size)
Parallel 1D convolutions with kernel sizes 3, 5 and 7, each with stride=1, number of filters=32, activation type=linear, border mode=same
Bidirectional LSTM, output_dim=128
Bidirectional LSTM, output_dim=128
Dense layer, output_dim=128, activation type=ReLU
Dense layer, output_dim=T, activation type=linear
My code is:
from keras import layers, Input
from keras.models import Model

def lstm_net(T):
    input_layer = Input(shape=(T, 1))
    branch_a = layers.Conv1D(32, 3, activation='linear', padding='same', strides=1)(input_layer)
    branch_b = layers.Conv1D(32, 5, activation='linear', padding='same', strides=1)(input_layer)
    branch_c = layers.Conv1D(32, 7, activation='linear', padding='same', strides=1)(input_layer)
    merge_layer = layers.Concatenate(axis=-1)([branch_a, branch_b, branch_c])
    print(merge_layer.shape)
    BLSTM1 = layers.Bidirectional(layers.LSTM(128, input_shape=(8, 40, 96)))(merge_layer)
    print(BLSTM1.shape)
    BLSTM2 = layers.Bidirectional(layers.LSTM(128))(BLSTM1)
    dense_layer = layers.Dense(128, activation='relu')(BLSTM2)
    output_dense = layers.Dense(1, activation='linear')(dense_layer)
    model = Model(input_layer, output_dense)
    model.name = "lstm_net"
    return model

model = lstm_net(40)
After that I get the error above. My goal is to feed a batch of 8 sequences of length 40 as input and get a batch of 8 sequences of length 40 as output. I found the issue "LSTM layer cannot connect to Dense layer after Flatten #818" on the Keras GitHub, where @fchollet suggests specifying the input_shape in the first layer, which I did, but it's probably not right. I put in the two print statements to see how the shape changes, and the output is:
(?, 40, 96)
(?, 256)
The error occurs on the line where BLSTM2 is defined and can be seen in full here.
Answer (score: 1)
Your problem lies in these three lines:
BLSTM1 = layers.Bidirectional(layers.LSTM(128, input_shape=(8,40,96)))(merge_layer)
print(BLSTM1.shape)
BLSTM2 = layers.Bidirectional(layers.LSTM(128))(BLSTM1)
By default, LSTM returns only the last element of its computation, so your data loses its sequential nature. That's why the subsequent layer raises an error. Change these lines to:
BLSTM1 = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(merge_layer)
print(BLSTM1.shape)
BLSTM2 = layers.Bidirectional(layers.LSTM(128))(BLSTM1)
so that the input to the second LSTM also keeps its sequential nature.
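As a quick sanity check, the print statements should then reflect the preserved time axis. The shapes below assume T = 40 and the 3 × 32 = 96 concatenated filters from your code; the second print on BLSTM2 is added here just for illustration:

print(BLSTM1.shape)  # (?, 40, 256): time axis kept, 2 * 128 bidirectional units
print(BLSTM2.shape)  # (?, 256): the second BLSTM still returns only its last output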
Apart from this, I'd rather not use input_shape on an intermediate layer of the model, since it's inferred automatically.
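Putting both changes together, here is a minimal sketch of the corrected function. It uses the same layer sizes as in your question and applies only the two fixes above (return_sequences=True on the first BLSTM, no input_shape on intermediate layers), not a rework of the architecture:

from keras import layers, Input
from keras.models import Model

def lstm_net(T):
    input_layer = Input(shape=(T, 1))
    # Three parallel 1D convolutions (kernel sizes 3, 5, 7), concatenated to 96 channels
    branch_a = layers.Conv1D(32, 3, activation='linear', padding='same', strides=1)(input_layer)
    branch_b = layers.Conv1D(32, 5, activation='linear', padding='same', strides=1)(input_layer)
    branch_c = layers.Conv1D(32, 7, activation='linear', padding='same', strides=1)(input_layer)
    merge_layer = layers.Concatenate(axis=-1)([branch_a, branch_b, branch_c])
    # return_sequences=True keeps the time dimension so the next LSTM receives
    # a sequence; no input_shape here, it is inferred from merge_layer
    BLSTM1 = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(merge_layer)
    BLSTM2 = layers.Bidirectional(layers.LSTM(128))(BLSTM1)
    dense_layer = layers.Dense(128, activation='relu')(BLSTM2)
    output_dense = layers.Dense(1, activation='linear')(dense_layer)
    model = Model(input_layer, output_dense)
    return model

model = lstm_net(40)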