Low validation accuracy for video classification using VGG16 and LSTM

Time: 2020-05-10 19:37:04

Tags: keras

My project is about violence classification using a video dataset, so I converted all the videos to images, 7 images per video.

First I use VGG16 to extract features from the images, then I train my LSTM on those features. But when I train the LSTM I get poor accuracy, and strangely both the training accuracy and val_accuracy stay almost constant across many epochs: my accuracy is still around 0.5000 at about 50 epochs, and the validation accuracy has the same problem.

Here is my code, in case you can figure out what the problem is, or why the validation accuracy stays constant and low.

import os, shutil
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization, LSTM
from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator

conv = tf.keras.applications.vgg16.VGG16()

model = Sequential()

Here I copy VGG16 into a Sequential model:

for layer in conv.layers:
  model.add(layer)

Here I remove all the dense and flatten layers, so that my last layer is the max-pool with shape (7, 7, 512):

model.pop()  # remove 'predictions' Dense(1000)
model.pop()  # remove 'fc2' Dense(4096)
model.pop()  # remove 'fc1' Dense(4096)
model.pop()  # remove 'flatten'
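For reference, I believe the same (7, 7, 512) feature extractor can be built directly with VGG16's documented include_top=False argument instead of popping layers; a minimal sketch (not the code I ran):

# Sketch: include_top=False drops the flatten/fc1/fc2/predictions head,
# so the model's output is the (7, 7, 512) block5_pool feature map directly.
conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
conv_base.summary()  # last layer: block5_pool, output shape (None, 7, 7, 512)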


datagen = ImageDataGenerator()
batch_size = 1
img_width, img_height = 224, 224  # Default input size for VGG16

This function is used to extract the features of each image; its parameters are the directory path and the number of images in that directory:

def extract_features(directory, sample_count):
    features = np.zeros(shape=(sample_count, 7, 7, 512))  # must match the output shape of the convolutional base
    labels = np.zeros(shape=(sample_count,2))
    # Preprocess data
    generator = datagen.flow_from_directory(directory,
                                            target_size=(img_width,img_height),
                                            batch_size = batch_size,
                                            class_mode='binary')
    # Pass data through convolutional base
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = model.predict(inputs_batch)
        features[i * batch_size: (i + 1) * batch_size] = features_batch
        labels[i * batch_size: (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            break
    return features, labels

train_violence="/content/drive/My Drive/images/one/data/train"
train_non="/content/drive/My Drive/images/two/data/train"
valid_violence="/content/drive/My Drive/images/three/data/validation"
valid_non="/content/drive/My Drive/images/four/data/validation"


train_violence_features, train_violence_labels = extract_features(train_violence,119)
train_non_features , train_non_labels = extract_features(train_non,119)
valid_violence_features , valid_violence_labels = extract_features(valid_violence,77) 
valid_non_features , valid_non_labels = extract_features(valid_non,77) 

Now I have training and validation features for both violence and non-violence, so I need to concatenate the two training arrays so that all the violence features come first, followed by the non-violence features. I found that flow_from_directory takes images from the two classes in random order, but I need them as sequences: every 7 images must form one sequence, so they cannot be shuffled across the two classes. That is why I used four directories, two for validation (one violence, one non-violence) and likewise two for training, and then concatenated them to keep the correct frame order for each video.

x= np.concatenate((train_violence_features, train_non_features))
y = np.concatenate((valid_violence_features, valid_non_features))
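Note that flow_from_directory also shuffles files by default, which could break the 7-frames-per-video order even inside a single class directory. A sketch of the generator call from extract_features with shuffling disabled (assuming, on my part, that the filenames sort frame by frame):

    # Sketch: shuffle=False reads files in sorted filename order, so the
    # 7 frames of each video stay together if the names sort that way.
    generator = datagen.flow_from_directory(directory,
                                            target_size=(img_width, img_height),
                                            batch_size=batch_size,
                                            class_mode='binary',
                                            shuffle=False)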

Now x has shape (238, 7, 7, 512), i.e. 238 images, and y (the validation features) has shape (154, 7, 7, 512).

Here I reshape the input to the LSTM into the shape (samples, timesteps, features): 34 videos for training, each converted into 7 images, so 7 is my number of timesteps, and the number of features is 7 * 7 * 512 = 25088.

lstm_train_sample = np.reshape(x,(34,7,25088))
lstm_validation_sample = np.reshape(y,(22,7,25088)) 
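As a small sanity check (hypothetical, not in my original code), the reshape only groups each video's 7 consecutive frames if the first dimension is divisible by the number of timesteps:

# Sketch: each block of 7 consecutive frames becomes one LSTM sample.
assert x.shape[0] % 7 == 0 and y.shape[0] % 7 == 0
lstm_train_sample = x.reshape(-1, 7, 7 * 7 * 512)       # (34, 7, 25088)
lstm_validation_sample = y.reshape(-1, 7, 7 * 7 * 512)  # (22, 7, 25088)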

Here I create the labels -> one label per video:

t_labels = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
v_labels = [1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0]
t_labels= keras.utils.to_categorical(t_labels, num_classes=2, dtype='float32')
v_labels= keras.utils.to_categorical(v_labels, num_classes=2, dtype='float32')
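The same one-hot labels could also be built programmatically instead of typing the lists out; an equivalent sketch:

# Sketch: 17 violence + 17 non-violence training videos, 11 + 11 for validation.
t_labels = keras.utils.to_categorical(np.array([1] * 17 + [0] * 17), num_classes=2)
v_labels = keras.utils.to_categorical(np.array([1] * 11 + [0] * 11), num_classes=2)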

And finally my LSTM:

lstm = Sequential()
lstm.add(LSTM(200, activation='relu', return_sequences=True, input_shape=(7, 25088)))
lstm.add(LSTM(25, activation='relu'))
lstm.add(Dense(20, activation='relu'))
lstm.add(Dense(10, activation='relu'))
lstm.add(Dense(2))
lstm.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

lstm.fit(lstm_train_sample, t_labels, epochs=100, batch_size=2,
         validation_data=(lstm_validation_sample, v_labels),
         validation_batch_size=2)
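For comparison (not the configuration that produced the log below, and I am not sure it is my problem): a conventional two-class setup would end in a softmax layer trained with a cross-entropy loss instead of a linear Dense(2) trained with mse, since to_categorical produces one-hot targets. A sketch:

# Sketch: softmax turns the two outputs into class probabilities, and
# categorical_crossentropy matches the one-hot labels from to_categorical.
lstm_alt = Sequential()
lstm_alt.add(LSTM(200, activation='relu', return_sequences=True, input_shape=(7, 25088)))
lstm_alt.add(LSTM(25, activation='relu'))
lstm_alt.add(Dense(20, activation='relu'))
lstm_alt.add(Dense(10, activation='relu'))
lstm_alt.add(Dense(2, activation='softmax'))
lstm_alt.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])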

This is an example of the results, and I would like to know why this happens:

Epoch 10/30
12/12 [==============================] - 3s 216ms/step - loss: 0.4742 - accuracy: 0.7870 - val_loss: 1.3731 - val_accuracy: 0.5000
Epoch 11/30
12/12 [==============================] - 3s 218ms/step - loss: 0.6105 - accuracy: 0.7315 - val_loss: 1.2164 - val_accuracy: 0.5455
Epoch 12/30
12/12 [==============================] - 3s 216ms/step - loss: 0.4375 - accuracy: 0.7963 - val_loss: 1.3677 - val_accuracy: 0.4545
Epoch 13/30
12/12 [==============================] - 3s 216ms/step - loss: 0.4266 - accuracy: 0.7778 - val_loss: 1.5407 - val_accuracy: 0.4545
Epoch 14/30
12/12 [==============================] - 3s 222ms/step - loss: 0.3353 - accuracy: 0.8241 - val_loss: 1.4536 - val_accuracy: 0.4545
Epoch 15/30
12/12 [==============================] - 3s 216ms/step - loss: 0.4668 - accuracy: 0.8333 - val_loss: 1.5110 - val_accuracy: 0.4545
Epoch 16/30
12/12 [==============================] - 3s 221ms/step - loss: 0.4570 - accuracy: 0.7593 - val_loss: 1.5104 - val_accuracy: 0.5455
Epoch 17/30
12/12 [==============================] - 3s 219ms/step - loss: 0.2530 - accuracy: 0.8981 - val_loss: 1.3679 - val_accuracy: 0.5000
Epoch 18/30
12/12 [==============================] - 3s 220ms/step - loss: 0.3998 - accuracy: 0.8981 - val_loss: 1.4008 - val_accuracy: 0.5000
Epoch 19/30
12/12 [==============================] - 3s 220ms/step - loss: 0.3802 - accuracy: 0.8333 - val_loss: 1.4044 - val_accuracy: 0.5000
Epoch 20/30
12/12 [==============================] - 3s 223ms/step - loss: 0.4440 - accuracy: 0.8333 - val_loss: 1.4973 - val_accuracy: 0.5000
Epoch 21/30
12/12 [==============================] - 3s 223ms/step - loss: 0.4340 - accuracy: 0.8519 - val_loss: 1.5175 - val_accuracy: 0.5455
Epoch 22/30
12/12 [==============================] - 3s 224ms/step - loss: 0.3418 - accuracy: 0.8704 - val_loss: 1.2189 - val_accuracy: 0.5909
Epoch 23/30
12/12 [==============================] - 3s 219ms/step - loss: 0.2384 - accuracy: 0.8981 - val_loss: 1.3633 - val_accuracy: 0.5455
Epoch 24/30
12/12 [==============================] - 3s 221ms/step - loss: 0.3621 - accuracy: 0.8981 - val_loss: 1.4771 - val_accuracy: 0.5909
Epoch 25/30
12/12 [==============================] - 3s 223ms/step - loss: 0.2561 - accuracy: 0.8889 - val_loss: 1.6170 - val_accuracy: 0.5455
Epoch 26/30
12/12 [==============================] - 3s 223ms/step - loss: 0.2361 - accuracy: 0.9074 - val_loss: 1.5460 - val_accuracy: 0.5000

0 Answers