Convolutional autoencoder for analysing long one-dimensional sequences

Posted: 2018-08-04 13:57:52

Tags: keras dimensions autoencoder dimensionality-reduction

I have a dataset of one-dimensional vectors, each of length 3001. I have used a simple convolutional network to perform binary classification on these sequences:

# Keras imports (standalone Keras, per the traceback below)
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

shape = train_X.shape[1:]   # (3001, 1)
model = Sequential()
model.add(Conv1D(75, 3, strides=1, input_shape=shape, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
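
For reference, Conv1D expects three-dimensional input of shape (samples, steps, channels), so train_X is assumed to already be shaped (samples, 3001, 1); if it started out as a flat 2-D array, a preparation step along these lines would be needed:

import numpy as np

# Sketch of a preparation step: Conv1D wants (samples, steps, channels)
train_X = np.asarray(train_X).reshape(-1, 3001, 1).astype('float32')
test_X = np.asarray(test_X).reshape(-1, 3001, 1).astype('float32')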

The network achieves roughly 60% accuracy. I would now like to create an autoencoder to discover the general pattern that distinguishes the samples labelled "1" from those labelled "0", i.e. to generate an exemplary sequence that is representative of the samples labelled "1".

Based on previous blog posts and questions, I tried to put together an autoencoder that could achieve this:

from keras.models import Model
from keras.layers import Input, Conv1D, MaxPooling1D, UpSampling1D, Flatten, Dense

# Encoder
input_sig = Input(batch_shape=(None, 3001, 1))
x = Conv1D(64, 3, activation='relu', padding='same')(input_sig)
x1 = MaxPooling1D(2)(x)
x2 = Conv1D(32, 3, activation='relu', padding='same')(x1)
x3 = MaxPooling1D(2)(x2)
flat = Flatten()(x3)
encoded = Dense(1, activation='relu')(flat)   # note: not used by the decoder below

# Decoder (branches off x3, not off `encoded`)
x2_ = Conv1D(32, 3, activation='relu', padding='same')(x3)
x1_ = UpSampling1D(2)(x2_)
x_ = Conv1D(64, 3, activation='relu', padding='same')(x1_)
upsamp = UpSampling1D(2)(x_)
decoded = Conv1D(1, 3, activation='sigmoid', padding='same')(upsamp)

autoencoder = Model(input_sig, decoded)
autoencoder.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

Its summary looks like this:

autoencoder.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_57 (InputLayer)        (None, 3001, 1)           0         
_________________________________________________________________
conv1d_233 (Conv1D)          (None, 3001, 64)          256       
_________________________________________________________________
max_pooling1d_115 (MaxPoolin (None, 1500, 64)          0         
_________________________________________________________________
conv1d_234 (Conv1D)          (None, 1500, 32)          6176      
_________________________________________________________________
max_pooling1d_116 (MaxPoolin (None, 750, 32)           0         
_________________________________________________________________
conv1d_235 (Conv1D)          (None, 750, 32)           3104      
_________________________________________________________________
up_sampling1d_106 (UpSamplin (None, 1500, 32)          0         
_________________________________________________________________
conv1d_236 (Conv1D)          (None, 1500, 64)          6208      
_________________________________________________________________
up_sampling1d_107 (UpSamplin (None, 3000, 64)          0         
_________________________________________________________________
conv1d_237 (Conv1D)          (None, 3000, 64)          12352     
=================================================================
Total params: 28,096
Trainable params: 28,096
Non-trainable params: 0

So everything appears to work fine until I train the network:

autoencoder.fit(train_X,train_y,epochs=3,batch_size=100,validation_data=(test_X, test_y))

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/bsxcto/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1630, in fit
    batch_size=batch_size)
  File "/home/bsxcto/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1480, in _standardize_user_data
    exception_prefix='target')
  File "/home/bsxcto/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 113, in _standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking target: expected conv1d_237 to have 3 dimensions, but got array with shape (32318, 1)
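
The mismatch is between the rank of the decoder's output and the rank of the target array: fit is being given train_y, which per the traceback has shape (32318, 1), while the decoder's output is three-dimensional (batch, steps, channels). A small diagnostic sketch to compare the two directly:

# Compare what the model produces with what is being passed as the target
print(autoencoder.output_shape)   # 3-D: (batch, steps, channels)
print(train_y.shape)              # (32318, 1) -- one label per sample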

So I tried adding a 'Reshape' layer before the last one:

from keras.layers import Reshape

upsamp = UpSampling1D(2)(x_)
flat = Flatten()(upsamp)                    # (None, 192000)
reshaped = Reshape((3000, 64))(flat)        # back to (None, 3000, 64)
decoded = Conv1D(1, 3, activation='sigmoid', padding='same')(reshaped)

In this case the network looks like this:

autoencoder.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_59 (InputLayer)        (None, 3001, 1)           0         
_________________________________________________________________
conv1d_243 (Conv1D)          (None, 3001, 64)          256       
_________________________________________________________________
max_pooling1d_119 (MaxPoolin (None, 1500, 64)          0         
_________________________________________________________________
conv1d_244 (Conv1D)          (None, 1500, 32)          6176      
_________________________________________________________________
max_pooling1d_120 (MaxPoolin (None, 750, 32)           0         
_________________________________________________________________
conv1d_245 (Conv1D)          (None, 750, 32)           3104      
_________________________________________________________________
up_sampling1d_110 (UpSamplin (None, 1500, 32)          0         
_________________________________________________________________
conv1d_246 (Conv1D)          (None, 1500, 64)          6208      
_________________________________________________________________
up_sampling1d_111 (UpSamplin (None, 3000, 64)          0         
_________________________________________________________________
flatten_111 (Flatten)        (None, 192000)            0         
_________________________________________________________________
reshape_45 (Reshape)         (None, 3000, 64)          0         
_________________________________________________________________
conv1d_247 (Conv1D)          (None, 3000, 1)           193       
=================================================================
Total params: 15,937
Trainable params: 15,937
Non-trainable params: 0

But the same error results:

Error when checking target: expected conv1d_247 to have 3 dimensions, but got array with shape (32318, 1)

My questions are:

1) Is this a viable approach for finding the pattern that distinguishes samples labelled "1" from those labelled "0"?

2) How can I make the final layer accept the output of the last upsampling layer?

1 Answer:

Answer 0 (score: 0): Reuse the trained convolutional and pooling layers of the classifier, frozen, as the encoder, add a small decoder on top of them, train the result to reconstruct its inputs, and then average the decoded test samples to obtain a representative sequence.

# First train the classifier as before; repeat_length and stride are the
# answerer's variables for the kernel/pool size and the stride.
original = Sequential()
original.add(Conv1D(75, repeat_length, strides=stride, input_shape=shape, activation='relu', padding='same'))
original.add(MaxPooling1D(repeat_length))
original.add(Flatten())
original.add(Dense(1, activation='sigmoid'))
original.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
calculate_roc(original......)   # the answerer's own evaluation call, left elided

import matplotlib.pyplot as plt

# Reuse the trained convolutional and pooling layers as a frozen encoder,
# and train a small decoder on top of them to reconstruct the inputs.
mod = Sequential()
mod.add(original.layers[0])      # trained Conv1D
mod.add(original.layers[1])      # trained MaxPooling1D
mod.add(Conv1D(75, window, activation='relu', padding='same'))   # `window` is the answerer's variable
mod.add(UpSampling1D(window))
mod.add(Conv1D(1, 1, activation='sigmoid', padding='same'))
# Freeze the reused layers before compiling so the setting takes effect.
mod.layers[0].trainable = False
mod.layers[1].trainable = False
mod.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

# Train the autoencoder to reconstruct its input (the target is train_X, not train_y).
mod.fit(train_X, train_X, epochs=1, batch_size=100)

# Average the decoded test samples to obtain a representative sequence.
decoded_imgs = mod.predict(test_X)
x = decoded_imgs.mean(axis=0)
plt.plot(x)
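
To get the exemplary per-class sequence the question asks for, one possible extension is to average the decoded test samples per label rather than over the whole test set. A minimal sketch, assuming test_y holds 0/1 labels aligned with test_X:

import numpy as np
import matplotlib.pyplot as plt

# decoded_imgs has shape (samples, steps, 1); test_y is assumed to hold the
# 0/1 labels aligned with test_X.
labels = np.asarray(test_y).ravel()
mean_pos = decoded_imgs[labels == 1].mean(axis=0).ravel()   # exemplary '1' sequence
mean_neg = decoded_imgs[labels == 0].mean(axis=0).ravel()   # exemplary '0' sequence

plt.plot(mean_pos, label='label 1')
plt.plot(mean_neg, label='label 0')
plt.legend()
plt.show()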