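I am trying to fit my 2D array to an LSTM. Since an LSTM requires 3D input, I don't know how to do this. My .csv file has 21 columns: the first is the instance number and the last is the string-typed Class label. The rest are features. The CSV data has shape (540, 20).
I tried using np.expand_dims to expand the data to 3D, but I get an error like: ValueError: cannot reshape array of size 10800 into shape (378,1,20).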
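The numbers in the error message can be checked directly: an array of size 10800 (540 × 20) cannot be reshaped into (378, 1, 20), because that shape holds only 378 × 1 × 20 = 7560 elements. Below is a minimal sketch, using random stand-in data rather than the real CSV, of how np.expand_dims or reshape with -1 adds the timestep axis without hard-coding the number of rows. The array names are hypothetical.

```python
import numpy as np

# Stand-in for the real feature matrix: 540 rows, 20 columns (random values, not the CSV).
features = np.random.rand(540, 20)

# 540 * 20 = 10800 elements, while (378, 1, 20) holds only 7560, hence the ValueError.
print(features.size, 378 * 1 * 20)

# Insert a length-1 timestep axis in the middle: (540, 20) -> (540, 1, 20).
expanded = np.expand_dims(features, axis=1)

# Equivalent reshape; -1 lets NumPy infer the row count from the data.
reshaped = features.reshape(-1, 1, features.shape[1])

print(expanded.shape, reshaped.shape)
```

Both calls keep all 10800 values and only add an axis, so the same line works on the full dataset or on either split.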
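# Imports assume standalone Keras; use tensorflow.keras instead if that is what is installed
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.callbacks import EarlyStopping
from keras.utils import to_categorical

file = "training_mfcc.csv"
# Column names '0'..'19' for the first 20 columns, 'Class' for the label column
cols = [str(x) for x in np.arange(21)]
cols[20] = 'Class'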
dataset = pd.read_csv(file,header=None,names = cols)
# ---------------------------
# Preprocessing, i.e. convert the categorical class label to numeric
from sklearn.preprocessing import LabelEncoder
dataset['Class'] = LabelEncoder().fit_transform(dataset['Class'])
train,test=train_test_split(dataset,test_size=0.3,random_state=36,stratify=dataset['Class'])
X_train=train.drop('Class',axis=1)
Y_train=train['Class']
X_test=test.drop('Class',axis=1)
Y_test=test['Class']
X=dataset.drop('Class',axis=1)
Y=dataset['Class']
num_classes = len(set(Y))
X = np.array(X)
X = X.astype('float64')
X /= 255
Y = np.array(Y)
Y = Y.astype('float64')
Y_train = to_categorical(Y_train, num_classes)
Y_test = to_categorical(Y_test, num_classes)
Y = to_categorical(Y, num_classes)
print(X.shape)
X = X.reshape(X.shape[0],1, X.shape[1])
Y = Y.reshape(Y.shape[0],1, num_classes)
num_classes = 5
data_dim = 20
timesteps = 1
batch_size = 12
model = Sequential()
# model.add(Embedding(45, 15, input_length = (X.shape[1], 1), dropout = 0.3))
model.add(LSTM(32, return_sequences=True, stateful=True, batch_input_shape=(batch_size, timesteps, data_dim)))
model.add(LSTM(32, return_sequences=True, stateful=True))
model.add(Dense(num_classes, activation='softmax'))
# model.add(Flatten())
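# EarlyStopping is a callback for fit(), not an argument to compile()
early_stop = EarlyStopping(monitor='loss', patience=2)
# Labels were one-hot encoded with to_categorical above, so use categorical_crossentropy
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
model.fit(X_train, Y_train, batch_size=batch_size, epochs=100,
          verbose=1, validation_data=(X_test, Y_test), callbacks=[early_stop])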
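One thing that may matter with stateful=True and a fixed batch_input_shape: as far as I understand, Keras then expects every batch to contain exactly batch_size samples, so the number of samples passed to fit should be a multiple of batch_size. A 70/30 split of 540 rows gives 378 and 162 rows, and neither is divisible by 12. Below is a minimal sketch of one possible workaround, dropping the leftover samples. trim_to_batch_multiple is a hypothetical helper, not a Keras function, and the arrays are random stand-ins.

```python
import numpy as np

def trim_to_batch_multiple(X, y, batch_size):
    """Drop trailing samples so that len(X) is a multiple of batch_size."""
    usable = (len(X) // batch_size) * batch_size
    return X[:usable], y[:usable]

# Stand-in arrays shaped like the 70% training split (378 rows); values are random.
X_train_demo = np.random.rand(378, 1, 20)
y_train_demo = np.random.randint(0, 5, size=378)

X_trimmed, y_trimmed = trim_to_batch_multiple(X_train_demo, y_train_demo, batch_size=12)

print(378 % 12)         # samples left over with batch_size = 12
print(X_trimmed.shape)  # shape after dropping the leftover samples
```

Choosing a batch size that divides both split sizes, or dropping stateful=True, would be alternatives that avoid discarding data.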
I would like the code to adjust dynamically to the instances in the file and work with model.fit.
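To avoid hard-coding values such as num_classes = 5 and data_dim = 20, the sizes could be read from the arrays themselves. Below is a minimal sketch of that idea, assuming one timestep per row. prepare_for_lstm and the demo data are hypothetical and are not part of the code above.

```python
import numpy as np

def prepare_for_lstm(features, labels):
    """Return 3D inputs, one-hot labels, and the sizes derived from the data."""
    features = np.asarray(features, dtype="float64")
    # np.unique maps each label to an integer id in 0..num_classes-1.
    classes, label_ids = np.unique(labels, return_inverse=True)
    num_classes = len(classes)
    data_dim = features.shape[1]
    # (samples, features) -> (samples, 1, features): one timestep per row.
    X3d = features[:, np.newaxis, :]
    # One-hot encode: row i of the identity matrix is the vector for class i.
    y_onehot = np.eye(num_classes)[label_ids]
    return X3d, y_onehot, num_classes, data_dim

# Random stand-in data: 30 rows, 20 features, 5 string class labels.
demo_features = np.random.rand(30, 20)
demo_labels = np.array(["a", "b", "c", "d", "e"] * 6)

X3d, y_onehot, num_classes, data_dim = prepare_for_lstm(demo_features, demo_labels)
print(X3d.shape, y_onehot.shape, num_classes, data_dim)
```

The returned num_classes and data_dim could then be passed to batch_input_shape=(batch_size, 1, data_dim) and Dense(num_classes, ...) instead of fixed numbers, so a file with a different number of rows or classes would not require editing the model definition.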