Keras CNN converges to only one class

Date: 2017-01-13 01:07:41

Tags: audio tensorflow neural-network keras conv-neural-network

I have successfully built and trained a standard 4-layer MLP in Keras for an audio classification task. However, when I add convolutional layers to the network, the CNN trains successfully only about 25% of the time. In the remaining runs, the network always predicts a single class at test time, giving 10% accuracy on the 10-class problem.

I am not sure whether the network is failing to train at all, whether it is getting stuck in a local minimum, or whether something else entirely is going on.
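
For reference, a 10-class softmax that has learned nothing and outputs uniform probabilities gives a cross-entropy of -ln(1/10) ≈ 2.3026, which is exactly where the loss sits in the failed runs shown further down. A quick check of that baseline:

```python
import math

# Cross-entropy of a uniform prediction over 10 classes: -ln(1/10)
nb_classes = 10
baseline = -math.log(1.0 / nb_classes)
print(round(baseline, 4))  # 2.3026
```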

My input data consists of 5-second audio clips that have been converted to spectrograms and flattened into vectors. There are 10 classes, and the MLP model identifies the correct class with ~55% accuracy (not great, but far better than random guessing).
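
For context, a rough sketch of that preprocessing (not my exact code — the sample rate, frame size, and hop are assumptions) that turns a 5-second clip into a spectrogram of roughly the 129×200 shape used by the model below:

```python
import numpy as np

fs = 8000                          # assumed sample rate (not stated in the post)
audio = np.random.randn(5 * fs)    # stand-in for one 5-second clip

nperseg, hop = 256, 200            # assumed framing; rfft of 256 samples -> 129 bins
window = np.hanning(nperseg)
frames = np.stack([audio[i:i + nperseg]
                   for i in range(0, len(audio) - nperseg + 1, hop)])
spec = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2  # power spectrogram
flat = spec.T.flatten()            # (freq, time) flattened into one vector

print(spec.T.shape)  # (129, 199) with these assumptions, close to 129x200
```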

The failed output I have been getting so far looks like this...

(5500, 'train samples')
(100, 'test samples')
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
convolution2d_1 (Convolution2D)  (None, 127, 198, 64)  640         convolution2d_input_1[0][0]      
____________________________________________________________________________________________________
activation_1 (Activation)        (None, 127, 198, 64)  0           convolution2d_1[0][0]            
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 125, 196, 64)  36928       activation_1[0][0]               
____________________________________________________________________________________________________
activation_2 (Activation)        (None, 125, 196, 64)  0           convolution2d_2[0][0]            
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 62, 98, 64)    0           activation_2[0][0]               
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 62, 98, 64)    0           maxpooling2d_1[0][0]             
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 388864)        0           dropout_1[0][0]                  
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 128)           49774720    flatten_1[0][0]                  
____________________________________________________________________________________________________
activation_3 (Activation)        (None, 128)           0           dense_1[0][0]                    
____________________________________________________________________________________________________
dropout_2 (Dropout)              (None, 128)           0           activation_3[0][0]               
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 10)            1290        dropout_2[0][0]                  
____________________________________________________________________________________________________
activation_4 (Activation)        (None, 10)            0           dense_2[0][0]                    
====================================================================================================
Total params: 49813578
____________________________________________________________________________________________________
Train on 5500 samples, validate on 100 samples
Epoch 1/100
5500/5500 [==============================] - 72s - loss: 2.3067 - acc: 0.0929 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 2/100
5500/5500 [==============================] - 69s - loss: 2.3028 - acc: 0.0958 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 3/100
5500/5500 [==============================] - 69s - loss: 2.3028 - acc: 0.0875 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 4/100
5500/5500 [==============================] - 69s - loss: 2.3027 - acc: 0.0882 - val_loss: 2.3026 - val_acc: 0.1000

...


Epoch 99/100
5500/5500 [==============================] - 69s - loss: 2.3027 - acc: 0.0865 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 100/100
5500/5500 [==============================] - 69s - loss: 2.3027 - acc: 0.0920 - val_loss: 2.3026 - val_acc: 0.1000

('Test score:', 2.3025851917266844)
('Test accuracy:', 0.10000000000000001)

 32/100 [========>.....................] - ETA: 1s
 64/100 [==================>...........] - ETA: 0s
 96/100 [===========================>..] - ETA: 0s
100/100 [==============================] - ETA: 0s     
[5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5]

As you can see, the network predicts only class 5. On top of that, the loss never actually decreases.
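
A minimal way to confirm the collapse numerically (`y_pred` here is a stand-in for the real `model.predict_classes(X_test)` output, not the actual run):

```python
import numpy as np

# Stand-in for model.predict_classes(X_test) in a failed run: all class 5.
y_pred = np.array([5] * 100)

counts = np.bincount(y_pred, minlength=10)  # predictions per class
collapsed = counts.max() == len(y_pred)     # True if one class got everything

print(counts)     # [  0   0   0   0   0 100   0   0   0   0]
print(collapsed)  # True
```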

By contrast, a successful run looks like this...

    (5500, 'train samples')
    (100, 'test samples')

    ____________________________________________________________________________________________________
    Layer (type)                     Output Shape          Param #     Connected to                     
    ====================================================================================================
    convolution2d_1 (Convolution2D)  (None, 126, 197, 64)  1088        convolution2d_input_1[0][0]      
    ____________________________________________________________________________________________________
    activation_1 (Activation)        (None, 126, 197, 64)  0           convolution2d_1[0][0]            
    ____________________________________________________________________________________________________
    convolution2d_2 (Convolution2D)  (None, 123, 194, 64)  65600       activation_1[0][0]               
    ____________________________________________________________________________________________________
    activation_2 (Activation)        (None, 123, 194, 64)  0           convolution2d_2[0][0]            
    ____________________________________________________________________________________________________
    maxpooling2d_1 (MaxPooling2D)    (None, 61, 97, 64)    0           activation_2[0][0]               
    ____________________________________________________________________________________________________
    dropout_1 (Dropout)              (None, 61, 97, 64)    0           maxpooling2d_1[0][0]             
    ____________________________________________________________________________________________________
    flatten_1 (Flatten)              (None, 378688)        0           dropout_1[0][0]                  
    ____________________________________________________________________________________________________
    dense_1 (Dense)                  (None, 128)           48472192    flatten_1[0][0]                  
    ____________________________________________________________________________________________________
    activation_3 (Activation)        (None, 128)           0           dense_1[0][0]                    
    ____________________________________________________________________________________________________
    dropout_2 (Dropout)              (None, 128)           0           activation_3[0][0]               
    ____________________________________________________________________________________________________
    dense_2 (Dense)                  (None, 10)            1290        dropout_2[0][0]                  
    ____________________________________________________________________________________________________
    activation_4 (Activation)        (None, 10)            0           dense_2[0][0]                    
    ====================================================================================================
    Total params: 48540170
    ____________________________________________________________________________________________________
    Train on 5500 samples, validate on 100 samples
    Epoch 1/100
    5500/5500 [==============================] - 82s - loss: 2.3210 - acc: 0.0976 - val_loss: 2.3026 - val_acc: 0.1000
    Epoch 2/100
    5500/5500 [==============================] - 78s - loss: 2.3034 - acc: 0.0967 - val_loss: 2.3026 - val_acc: 0.1000
    Epoch 3/100
    5500/5500 [==============================] - 78s - loss: 2.2284 - acc: 0.1631 - val_loss: 1.9165 - val_acc: 0.3800

    ...

    Epoch 99/100
    5500/5500 [==============================] - 79s - loss: 0.0169 - acc: 0.9933 - val_loss: 3.7371 - val_acc: 0.5400
    Epoch 100/100
    5500/5500 [==============================] - 79s - loss: 0.0207 - acc: 0.9918 - val_loss: 3.1487 - val_acc: 0.5700

    ('Test score:', 3.1486789989471435)
    ('Test accuracy:', 0.56999999999999995)

     32/100 [========>.....................] - ETA: 0s
     64/100 [==================>...........] - ETA: 0s
     96/100 [===========================>..] - ETA: 0s
    100/100 [==============================] - 0s     
    [0 2 5 5 1 9 5 6 9 3 0 7 8 2 6 7 6 4 1 6 2 5 1 7 0 0 0 8 4 5 1 0 4 5 2 9 9
     2 6 0 2 0 4 1 3 0 5 4 4 4 8 7 6 5 9 1 5 7 9 9 2 9 7 1 1 8 5 8 6 4 1 4 9 5
     6 5 4 0 9 4 1 5 6 9 4 4 8 2 9 5 0 0 1 9 4 5 9 5 4 0]

     32/100 [========>.....................] - ETA: 0s
     64/100 [==================>...........] - ETA: 0s
     96/100 [===========================>..] - ETA: 0s
    100/100 [==============================] - 0s     
                 precision    recall  f1-score   support

         Class 0       0.77      1.00      0.87        10
         Class 1       0.91      1.00      0.95        10
         Class 2       0.88      0.70      0.78        10
         Class 3       0.50      0.10      0.17        10
         Class 4       0.20      0.30      0.24        10
         Class 5       0.56      0.90      0.69        10
         Class 6       0.67      0.60      0.63        10
         Class 7       0.33      0.20      0.25        10
         Class 8       0.50      0.30      0.37        10
         Class 9       0.43      0.60      0.50        10

    avg / total       0.57      0.57      0.55       100

    [[10  0  0  0  0  0  0  0  0  0]
     [ 0 10  0  0  0  0  0  0  0  0]
     [ 0  0  7  0  0  3  0  0  0  0]
     [ 2  0  0  1  1  0  0  0  0  6]
     [ 0  0  0  1  3  0  1  3  2  0]
     [ 0  0  0  0  0  9  0  0  0  1]
     [ 1  0  1  0  0  0  6  1  1  0]
     [ 0  0  0  0  6  1  0  2  0  1]
     [ 0  1  0  0  5  0  1  0  3  0]
     [ 0  0  0  0  0  3  1  0  0  6]]

Here is the network I have been running:

import numpy as np
import os, csv, gc, time, random
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils
from keras import backend as K
from sklearn.metrics import classification_report, confusion_matrix
import matplotlib.pyplot as plt

nb_classes = 10
batch_size = 32
nb_epoch = 100
img_rows, img_cols = 129, 200
size_of_samples = img_rows * img_cols
samplesDir = './samples'
trainFrom = 'Train'
testFrom = 'Test'


def parallel_shuffle(a, b):
    assert len(a) == len(b)
    p = np.random.permutation(len(a))
    return a[p], b[p]


def populate_file_arrays():
    train_files = []
    test_files = []
    sample_num = 0
    while len(train_files) < nb_classes and sample_num < 30:
        file_name = os.path.join(samplesDir, trainFrom, "sample" + str(sample_num) + ".txt")
        if os.path.isfile(file_name):
            train_files.append(file_name)
            test_files.append(os.path.join(samplesDir, testFrom, "sample" + str(sample_num) + ".txt"))
        sample_num += 1
    assert train_files and test_files
    return train_files, test_files


def sample_handling(sample):
    print("Opening", sample)
    featureset = np.array(list(csv.reader(open(sample, "r"), delimiter=','))).astype('float')
    return featureset


def create_feature_sets_and_labels(train_set, test_set):
    print("Creating featuresets")
    sample_class = 0
    for samples in train_set:

        if sample_class == 0:
            train_x = sample_handling(samples)
            train_y = np.empty(len(train_x), dtype=int)
            train_y.fill(sample_class)
        else:
            new_sample = sample_handling(samples)
            train_x = np.concatenate((train_x, new_sample), axis=0)
            classification = np.empty(len(new_sample), dtype=int)
            classification.fill(sample_class)
            train_y = np.append(train_y, classification)
        sample_class += 1

    sample_class = 0
    for samples in test_set:
        if sample_class == 0:
            test_x = sample_handling(samples)
            test_y = np.empty(len(test_x), dtype=int)
            test_y.fill(sample_class)
        else:
            new_sample = sample_handling(samples)
            test_x = np.concatenate((test_x, new_sample), axis=0)
            classification = np.empty(len(new_sample), dtype=int)
            classification.fill(sample_class)
            test_y = np.append(test_y, classification)
        sample_class += 1

    train_x, train_y = parallel_shuffle(train_x, train_y)
    test_x, test_y = parallel_shuffle(test_x, test_y)

    return (train_x, train_y), (test_x, test_y)

trainFiles, testFiles = populate_file_arrays()

(X_train, Y_train), (X_test, Y_test) = create_feature_sets_and_labels(trainFiles, testFiles)
# (X_train, Y_train), (X_test, Y_test) = mnist.load_data()



# number of convolutional filters to use
nb_filters = 64
# size of pooling area for max pooling
pool_size = (2, 2)
# convolution kernel size
kernel_size = (4, 4)

# the data, shuffled and split between train and test sets
# (X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)


# X_train = X_train.reshape(-1, size_of_samples)
# X_test = X_test.reshape(-1, size_of_samples)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# Scale both sets with the training set's min/max so they share the same range
minVal = np.amin(X_train)
X_train -= minVal
X_test -= minVal
maxVal = np.amax(X_train)
X_train /= maxVal
X_test /= maxVal

print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')
print(np.amin(X_train), np.amax(X_train))
print(np.amin(X_test), np.amax(X_test))

Y_train = np_utils.to_categorical(Y_train, nb_classes)
Y_test = np_utils.to_categorical(Y_test, nb_classes)

print(X_train.shape)
print(Y_train.shape)
print(X_test.shape)
print(Y_test.shape)


############################# MLP model #################################

# model = Sequential()
# model.add(Dense(512, input_shape=(size_of_samples,)))
# model.add(Activation('relu'))
# model.add(Dropout(0.2))
# model.add(Dense(512))
# model.add(Activation('relu'))
# model.add(Dropout(0.2))
# model.add(Dense(nb_classes))
# model.add(Activation('softmax'))
#
# # model.summary()
#
# model.compile(loss="categorical_crossentropy", optimizer='adam', metrics=['accuracy'])
#
# history = model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, verbose=1, validation_data=(X_test, Y_test))
# score = model.evaluate(X_test, Y_test, verbose=0)
# print('Test score:', score[0])
# print('Test accuracy:', score[1])


#################### Convolutional model ####################

model = Sequential()

model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.summary()

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

history = model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])

print(history.history.keys())
# summarize history for accuracy
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
#plt.show()

plt.savefig('CNN.png')

y_pred = model.predict_classes(X_test)
print(y_pred)

p = model.predict_proba(X_test)  # to predict probability

target_names = ['sample 0', 'sample 1', 'sample 2', 'sample 3', 'sample 4', 'sample 5', 'sample 6', 'sample 7', 'sample 8', 'sample 9']
print(classification_report(np.argmax(Y_test, axis=1), y_pred, target_names=target_names))
print(confusion_matrix(np.argmax(Y_test, axis=1), y_pred))

gc.collect()

Thanks in advance; any help is greatly appreciated.
