Neural network predicts only one class

Time: 2021-06-03 19:26:06

Tags: machine-learning neural-network conv-neural-network tf.keras image-classification

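My model predicts only one of the two classes in a binary classification task. It takes video input through the Keras Video Frame Generator, which yields 350 frames per video. The model has to take a 350-frame input sequence and output a binary class using a BLSTM. The input shape is (350, 112, 75, 3). Because of OOM errors, the batch size is 2. I don't know whether that is the problem or whether something is wrong in the code, but the model does not seem to learn anything. Here is the code: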

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPool2D, GlobalMaxPool2D

def build_convnet(shape=(112, 75, 3)):
    # Per-frame CNN feature extractor; expects 3-channel (RGB) frames
    momentum = .9
    model = Sequential()
    model.add(Conv2D(64, (3,3), input_shape=shape, padding='same', activation='relu'))
    model.add(Conv2D(64, (3,3), padding='same', activation='relu'))
    model.add(BatchNormalization(momentum=momentum))
    
    model.add(MaxPool2D())
    
    model.add(Conv2D(128, (3,3), padding='same', activation='relu'))
    model.add(Conv2D(128, (3,3), padding='same', activation='relu'))
    model.add(BatchNormalization(momentum=momentum))
    
    model.add(MaxPool2D())
    
    model.add(Conv2D(256, (3,3), padding='same', activation='relu'))
    model.add(Conv2D(256, (3,3), padding='same', activation='relu'))
    model.add(BatchNormalization(momentum=momentum))
    
    model.add(MaxPool2D())
    
    model.add(Conv2D(512, (3,3), padding='same', activation='relu'))
    model.add(Conv2D(512, (3,3), padding='same', activation='relu'))
    model.add(BatchNormalization(momentum=momentum))
    
    # Global max pooling collapses each feature map to one value: 512 features per frame
    model.add(GlobalMaxPool2D())
    
    return model

from tensorflow.keras.layers import LSTM, Bidirectional
from tensorflow.keras.layers import TimeDistributed, Dense, Dropout

def action_model(shape=(350, 112, 75, 3), nbout=2):
    # Create our convnet with (112, 75, 3) input shape
    convnet = build_convnet(shape[1:])
    
    # then create our final model
    model = Sequential()
    
    # apply the convnet to each of the 350 frames; input shape is (350, 112, 75, 3)
    model.add(TimeDistributed(convnet, input_shape=shape))
    # First BLSTM layer; its input shape (350, 512) is inferred from the previous layer
    model.add(Bidirectional(LSTM(units=512, return_sequences=True)))
    model.add(Dropout(0.5))

    # Adding a second LSTM layer and Dropout layer
    model.add(Bidirectional(LSTM(units=512, return_sequences=True)))
    model.add(Dropout(0.5))

    # Adding a third LSTM layer and Dropout layer; it returns one vector per video
    model.add(Bidirectional(LSTM(units=512)))
    model.add(Dropout(0.5))
    # Two-way softmax over the binary classes
    model.add(Dense(nbout, activation='softmax'))
    model.summary()
    return model
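
As a sanity check on what the first LSTM receives, the spatial shapes can be traced by hand. This is only a sketch, assuming the Keras defaults (`padding='same'` with stride 1 keeps height and width, and `MaxPool2D()` halves each dimension with flooring); the helper names are made up for illustration and are not part of the model code.

```python
def max_pool_2x2(height, width):
    # MaxPool2D() defaults to a 2x2 window, stride 2, padding='valid':
    # each spatial dimension is floor-divided by 2.
    return height // 2, width // 2


def trace_convnet(height=112, width=75, filters=(64, 128, 256, 512)):
    """Return the feature-map shape after each conv block and the final feature size."""
    shapes = []
    for i, n_filters in enumerate(filters):
        # Conv2D with padding='same' and stride 1 leaves height and width unchanged.
        shapes.append((height, width, n_filters))
        # The convnet pools after every block except the last one.
        if i < len(filters) - 1:
            height, width = max_pool_2x2(height, width)
    # GlobalMaxPool2D keeps one value per channel.
    return shapes, filters[-1]


shapes, features_per_frame = trace_convnet()
for shape in shapes:
    print(shape)
print("features per frame:", features_per_frame)
```

Under these assumptions each frame reaches the first LSTM as a 512-dimensional vector, not as the 112*75*3 raw pixel values, so the sequence fed to the BLSTM has shape (350, 512).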

Here is the model summary.

[image: model summary]
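
One quick check for the "only one class" symptom is to look at how the labels are distributed, since a strongly imbalanced set can push a model toward always predicting the majority class. The sketch below is an assumption-laden illustration: the `labels` list is hypothetical (the real labels come from the generator, which is not shown), and the weighting formula is the common "balanced" heuristic `n_samples / (n_classes * count)`.

```python
from collections import Counter


def class_counts(labels):
    """Count how many samples carry each class label."""
    return dict(Counter(labels))


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (the 'balanced' heuristic)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical label list, for illustration only: 90 videos of class 0, 10 of class 1.
labels = [0] * 90 + [1] * 10

print(class_counts(labels))
print(balanced_class_weights(labels))
```

A dictionary like this could be passed as the `class_weight` argument of `model.fit`, which expects integer class indices as keys; whether it behaves as intended with a particular generator is worth verifying. Running the same count over the model's predicted classes shows whether it really outputs a single class for every video.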

2 answers:

Answer 0 (score: 0)

I think you are putting a lot of data into memory. Have you tried reducing the number of frames per video? Does that help?
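
To get a feel for the numbers, here is a rough sketch that estimates the memory held by the output of the first convolution alone and picks evenly spaced frame indices for a shorter clip. It assumes float32 activations, the shapes from the question (112x75 frames, 64 filters, batch size 2), and hypothetical helper names; how the generator itself is configured to return fewer frames depends on its API, which is not shown here.

```python
def first_conv_activation_bytes(frames, height=112, width=75, filters=64,
                                batch_size=2, bytes_per_value=4):
    """Bytes needed to store the first Conv2D output for one batch (float32 assumed)."""
    return batch_size * frames * height * width * filters * bytes_per_value


def evenly_spaced_indices(total_frames, target_frames):
    """Pick target_frames indices spread evenly over range(total_frames)."""
    if not 0 < target_frames <= total_frames:
        raise ValueError("target_frames must be between 1 and total_frames")
    step = total_frames / target_frames
    return [int(i * step) for i in range(target_frames)]


GIB = 1024 ** 3
for n_frames in (350, 70, 35):
    size = first_conv_activation_bytes(n_frames)
    print(f"{n_frames} frames: {size / GIB:.2f} GiB for the first conv output")

# Indices of the frames to keep when cutting 350 frames down to 35.
print(evenly_spaced_indices(350, 35))
```

This counts a single layer's output only; training also keeps the activations of the other layers and their gradients, so the real footprint is larger. The estimate scales linearly with the frame count, which is why fewer frames per video is a cheap thing to try first.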

Answer 1 (score: 0)

Your model is too large. I would remove this block:

model.add(Conv2D(128, (3,3), padding='same', activation='relu'))
model.add(Conv2D(128, (3,3), padding='same', activation='relu'))
model.add(BatchNormalization(momentum=momentum))
model.add(MaxPool2D())
model.add(Conv2D(256, (3,3), padding='same', activation='relu'))
model.add(Conv2D(256, (3,3), padding='same', activation='relu'))
model.add(BatchNormalization(momentum=momentum))
model.add(MaxPool2D())
model.add(Conv2D(512, (3,3), padding='same', activation='relu'))
model.add(Conv2D(512, (3,3), padding='same', activation='relu'))
model.add(BatchNormalization(momentum=momentum))

It may be what causes the OOM, and you may not need it.
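
A rough parameter count shows what removing those blocks changes. The sketch below uses the standard formulas for Conv2D, BatchNormalization, LSTM, and Dense layers with biases enabled; the function names are hypothetical, and the layer sizes are taken from the question's code.

```python
def conv2d_params(in_channels, out_channels, kernel=3):
    # One kernel x kernel x in_channels filter plus a bias per output channel.
    return (kernel * kernel * in_channels + 1) * out_channels


def batchnorm_params(channels):
    # gamma, beta, moving mean, moving variance (the last two are not trainable).
    return 4 * channels


def convnet_params(block_filters, in_channels=3):
    """Parameters of a stack of Conv-Conv-BatchNorm blocks."""
    total = 0
    for out_channels in block_filters:
        total += conv2d_params(in_channels, out_channels)
        total += conv2d_params(out_channels, out_channels)
        total += batchnorm_params(out_channels)
        in_channels = out_channels
    return total


def bilstm_params(input_dim, units):
    # An LSTM has 4 gates, each with input weights, recurrent weights, and a bias;
    # the bidirectional wrapper holds two such LSTMs.
    return 2 * 4 * ((input_dim + units) * units + units)


def recurrent_params(feature_dim, units=512, nbout=2):
    """Three stacked BLSTM layers followed by the Dense softmax layer."""
    first = bilstm_params(feature_dim, units)
    later = 2 * bilstm_params(2 * units, units)  # inputs are 2 * units wide
    dense = 2 * units * nbout + nbout
    return first + later + dense


full_convnet = convnet_params([64, 128, 256, 512])
small_convnet = convnet_params([64])  # only the first block is kept

print("convnet, all blocks: ", full_convnet)
print("convnet, first block:", small_convnet)
print("BLSTM stack on 512 features:", recurrent_params(512))
print("BLSTM stack on 64 features: ", recurrent_params(64))
```

By this count the three BLSTM layers hold more parameters than the convnet, so lowering `units` may be worth trying as well. Parameter count is also only part of the memory picture: the activations of the early, full-resolution layers are stored for every frame in the batch.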