CNN with LSTM: training output is not in the valid range

Date: 2019-01-31 22:58:33

Tags: tensorflow keras classification conv-neural-network lstm

I am using Keras with TensorFlow as the backend.

The goal of the simulation is to build a classifier from a geospatial time-series dataset. The target Y is labeled -1, 0, 1, or 2 at each grid point, where -1 marks points with no measured data, 0 is good-quality data, 1 is medium-quality data, and 2 is the worst.
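
For concreteness, labels like these are usually shifted to a non-negative range and one-hot encoded before they can be compared against a softmax output; a minimal sketch with a toy array (this illustrates the encoding only, not the exact preprocessing used below):

import numpy as np
from keras.utils import to_categorical

y = np.array([-1, 0, 1, 2, 0])                    # toy quality labels
y_onehot = to_categorical(y + 1, num_classes=4)   # shift to 0..3, then one-hot
print(y_onehot.shape)                             # (5, 4)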

Right now I have two inputs. Atmospheric surface variables such as wind, wind speed, and rainfall form the first input, and ocean surface variables such as sea surface temperature and sea surface salinity form the second. All input datasets share the dimensions (n_samples, n_timesteps, n_variables, n_xpoints: longitude, n_ypoints: latitude). The target dataset is 3-D: (n_samples, n_xpoints: longitude, n_ypoints: latitude).
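
In array terms, the inputs and target described above look like this (sizes here are hypothetical; only the axis layout comes from the description, and the 61 x 55 grid from the Reshape layer further down):

import numpy as np

n_samples, n_timesteps = 100, 8
n_x, n_y = 61, 55
Train_atm = np.random.rand(n_samples, n_timesteps, 3, n_x, n_y)   # 3 atmospheric variables
Train_ocn = np.random.rand(n_samples, n_timesteps, 2, n_x, n_y)   # 2 ocean variables
Train_Y   = np.random.randint(-1, 3, size=(n_samples, n_x, n_y))  # labels in {-1, 0, 1, 2}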

In addition, every input variable is normalized by its value range. For example, sea surface current velocity is normalized from (-2, 2) [m/s] to (-1, 1), and surface wind speed is normalized from (-20, 20) [m/s] to (-1, 1).
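
That normalization is a plain linear rescale into (-1, 1); a small helper makes it explicit (the helper name is mine, not from the original code):

def scale_to_unit(x, lo, hi):
    """Linearly map values in [lo, hi] to [-1, 1]."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

# e.g. current speed: scale_to_unit(current, -2.0, 2.0)
#      wind speed:    scale_to_unit(wind, -20.0, 20.0)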

The model configuration is described below.

import keras
from keras.models import Sequential, Model
from keras.layers import (Input, Conv2D, MaxPooling2D, BatchNormalization,
                          Dropout, Activation, Flatten, Dense, LSTM,
                          TimeDistributed, Reshape)
from keras.callbacks import ModelCheckpoint
from keras.utils import plot_model

def cnn():
    # Per-timestep CNN feature extractor (channels-first grids)
    model = Sequential()
    model.add(Conv2D(64, (3, 3), activation='relu',
                     data_format='channels_first', kernel_initializer='he_normal',
                     name='conv1'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(BatchNormalization())

    model.add(Conv2D(32, (3, 3), activation='relu',
                     kernel_initializer='he_normal', data_format='channels_first',
                     name='conv2'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(Dropout(0.2))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    return model
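
# Note: cnn() sets no input_shape, so this Sequential feature extractor is only
# built the first time it is applied to data -- here inside the TimeDistributed
# wrapper below, which runs it independently on each timestep and yields one
# 128-d feature vector per timestep.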

def cnn2lstm(Input_shape, premo, name):
    # Run the shared CNN on every timestep, then model the sequence with LSTMs
    branch_in = Input(shape=Input_shape, dtype='float32')
    model = TimeDistributed(premo)(branch_in)
    model = LSTM(256, return_sequences=True, name=name + '_lstm1')(model)
    model = TimeDistributed(Dense(4096, activation='relu'))(model)
    model = TimeDistributed(Dropout(0.3))(model)
    model = LSTM(256, return_sequences=True, name=name + '_lstm2')(model)
    # Each branch returns a sequence: one 101-d sigmoid vector per timestep
    model = Dense(101, activation='sigmoid')(model)
    model = Dropout(0.3)(model)
    return branch_in, model

atm_in, atm = cnn2lstm(Train_atm.shape[1:], cnn(), 'atm')
ocn_in, ocn = cnn2lstm(Train_ocn.shape[1:], cnn(), 'ocn')
bio_in, bio = cnn2lstm(Train_bio.shape[1:], cnn(), 'bio')  # third branch, referenced by model2 below

#--- merge the three branch outputs into a single head
x = keras.layers.concatenate([atm, ocn, bio], axis=1)
x = LSTM(150, return_sequences=True)(x)
x = Dropout(0.2)(x)
x = LSTM(200, return_sequences=True)(x)
x = Dropout(0.2)(x)
x = LSTM(500)(x)
x = Dense(1001, activation='relu')(x)
x = Dense(2001, activation='relu')(x)
x = Dense(2501, activation='tanh')(x)
x = Dense(2701, activation='relu')(x)
x = Dense(3355, activation='softmax')(x)
x = Reshape((61, 55))(x)  # 61 * 55 = 3355 grid points

model2 = Model(inputs=[atm_in, ocn_in, bio_in], outputs=x)
plot_model(model2, show_shapes=True, to_file='model_way4_2.png')
model2.compile(optimizer='rmsprop', loss='categorical_crossentropy',
               metrics=['accuracy'])

filepath = 'ways4_model02_best.hdf5'
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')
callbacks_list = [checkpoint]
hist = model2.fit([Train_atm, Train_ocn, Train_bio], Train_Y,
                  epochs=150, batch_size=3, validation_split=0.1,
                  shuffle=True, callbacks=callbacks_list, verbose=0)

scores = model2.evaluate([Train_atm, Train_ocn, Train_bio], Train_Y)
print("MODEL 2 %s: %.2f%%" % (model2.metrics_names[1], scores[1] * 100))

The evaluation score here is mostly 83% or higher. But the output of model2.predict does not fall in the valid range of my target dataset. Instead, the model outputs values between 0 and 1, i.e. in (0, 1), with a pattern similar to the one the target dataset shows.
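
For contrast, a head that treated each grid point as a 4-way classification would output shape (n_samples, 61, 55, 4), and its probabilities could be mapped back onto the -1..2 labels with an argmax; a sketch under that assumption (per_pixel_model is hypothetical, not the model above):

import numpy as np

# Assumed output shape: (n_samples, 61, 55, 4), one softmax distribution
# over the 4 quality classes per grid point.
probs = per_pixel_model.predict([Train_atm, Train_ocn, Train_bio])
pred_labels = probs.argmax(axis=-1) - 1   # class indices 0..3 -> labels -1..2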

Can anyone tell me what the major problem with my DL algorithm is?

0 Answers