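I'm new to combining CNNs with LSTMs. I have a dataset of image sequences of varying length: it contains images from 70 videos, each with 50-70 frames. In addition, I have an array with the object's position and another array with its speed, which serve as the outputs for these frames (so the model has 2 outputs, i.e. 2 dense layers).
This is my current code: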
import keras
from keras.layers import *
from keras.models import load_model, Model, Sequential
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
import pandas as pd
import numpy as np
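num_frames = None  # ?? not sure what to put here; each video has 50-70 frames, and None should leave the sequence length variable
width = 256
height = 256
channels = 3
Input_1 = Input(shape=(num_frames, height, width, channels))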
x = TimeDistributed(Conv2D(64, (3, 3), activation='relu'))(Input_1)
x = TimeDistributed(MaxPooling2D((2, 2), strides=(1, 1)))(x)
x = TimeDistributed(Conv2D(128, (4,4), activation='relu'))(x)
x = TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2)))(x)
x = TimeDistributed(Conv2D(256, (4,4), activation='relu'))(x)
x = TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2)))(x)
x = TimeDistributed(Flatten())(x)
x = Dropout(0.5)(x)
x = LSTM(256, return_sequences=False, dropout=0.5)(x)
out1 = Dense(3, activation='linear')(x)  # position
out2 = Dense(7, activation='linear')(x)  # speed
model = Model(inputs=Input_1, outputs=[out1, out2])
model.compile(optimizer='adam', loss='mse')
model.summary()
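These are my questions:
1. My data is stored in a pandas DataFrame, where each row represents (frame_time, frame_img_file_name, position, speed). How do I read it so that the frames are fed in sequentially? Also, how do I load it into the Keras model?
2. To check my understanding: here we feed in the frames of each video independently of the other videos?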
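To make question 1 more concrete, here is a minimal sketch of the loading step as I currently imagine it; it is only a sketch and may well be the wrong approach. It assumes several things that are not in my real data: a hypothetical `video_id` column that says which video each row belongs to, a hypothetical `load_frame` callable that turns a file name into an image array, zero-padding at the front of shorter clips up to `max_frames`, and one target per clip taken from its last frame (because the LSTM uses `return_sequences=False`).

```python
import numpy as np
import pandas as pd


def build_sequences(df, load_frame, max_frames):
    """Group rows by video, order them by time, and pad each clip.

    Assumptions (hypothetical, not part of the real data):
    - `video_id` identifies the video a row belongs to.
    - `load_frame` maps a file name to an (H, W, C) array.
    - Shorter clips are zero-padded at the front; longer ones are cut.
    - Each clip gets a single target, taken from its last kept frame.
    """
    clips, positions, speeds = [], [], []
    for _, rows in df.groupby("video_id", sort=True):
        rows = rows.sort_values("frame_time").iloc[:max_frames]
        frames = np.stack(
            [load_frame(name) for name in rows["frame_img_file_name"]]
        ).astype(np.float32)
        padded = np.zeros((max_frames,) + frames.shape[1:], dtype=np.float32)
        padded[max_frames - len(frames):] = frames  # real frames go last
        clips.append(padded)
        last = rows.iloc[-1]
        positions.append(np.asarray(last["position"], dtype=np.float32))
        speeds.append(np.asarray(last["speed"], dtype=np.float32))
    return np.stack(clips), np.stack(positions), np.stack(speeds)


def fake_loader(name):
    # Stand-in for real image loading: fills a tiny image with the
    # frame index encoded in the second character of the file name.
    return np.full((8, 8, 3), float(name[1]), dtype=np.float32)


# Toy DataFrame with deliberately shuffled rows for video 0.
df = pd.DataFrame({
    "video_id": [0, 0, 0, 1, 1],
    "frame_time": [0.2, 0.0, 0.1, 0.0, 0.1],
    "frame_img_file_name": ["a2.png", "a0.png", "a1.png", "b0.png", "b1.png"],
    "position": [[float(i)] * 3 for i in range(5)],
    "speed": [[float(i)] * 7 for i in range(5)],
})

X, y_pos, y_speed = build_sequences(df, fake_loader, max_frames=4)
print(X.shape, y_pos.shape, y_speed.shape)
```

If this is roughly right, I would expect to pass the result to the model as `model.fit(X, [y_pos, y_speed])`, with each video being one sample along the first axis, but I am not sure whether padding like this is the recommended way to handle clips of different lengths.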
I would be very grateful if anyone could help!