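I have data with 20 channels, each holding 5,000 values per record (more than 150,000 records in total, stored as .npy files on the hard disk).
I am following the Keras fit_generator tutorial at https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly.html to read the data (each record is read as a (5000, 20) numpy array of type float32).
The network I have in mind has a parallel convolutional network for each channel, with the branches concatenated at the end, so the data needs to be fed in parallel. Reading a single channel from the data and feeding it to a single network works: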
    def __data_generation(self, list_IDs_temp):
        'Generates data containing batch_size samples'  # X : (n_samples, *dim, n_channels)
        # Initialization
        if self.n_channels == 1:
            X = np.empty((self.batch_size, *self.dim))
        else:
            X = np.empty((self.batch_size, *self.dim, self.n_channels))
        y = np.empty((self.batch_size), dtype=int)
        # Generate data
        for i, ID in enumerate(list_IDs_temp):
            # Store sample
            d = np.load(self.data_path + ID + '.npy')
            d = d[:, self.required_channel]
            # Append a trailing axis (axis=-1 works whatever the rank of d;
            # axis=2 is out of range for a 1-D slice on recent NumPy versions)
            d = np.expand_dims(d, -1)
            X[i,] = d
            # Store class
            y[i] = self.labels[ID]
        return X, keras.utils.to_categorical(y, num_classes=self.n_classes)
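However, when I read the whole record and try to feed it to the network, slicing it with a Lambda layer, I get an error.
Reading the whole record:
    X[i,] = np.load(self.data_path + ID + '.npy')
Using the Lambda slicing layer implementation from https://github.com/keras-team/keras/issues/890 and calling it: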
    input = Input(shape=(5000, 20))
    slicedInput = crop(2, 0, 1)(input)
I am able to compile the model, and it shows the expected layer sizes.
When data is fed to this network, I get:
    ValueError: could not broadcast input array from shape (5000,20) into shape (5000,1)
Any help would be much appreciated.
Answer 0 (score: 5)
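As mentioned in the Github thread you reference, a Lambda layer can only return one output, so the proposed crop(dimension, start, end) only returns a single "Tensor on a given dimension from start to end".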
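To make that concrete, here is a rough NumPy stand-in for such a helper. It is not the code from the thread: it only mimics the slicing semantics on plain arrays, and the batch size of 2 is an arbitrary placeholder. It shows that crop(2, 0, 1) keeps only the first channel, and that one call per channel is needed to obtain all 20 slices.

```python
import numpy as np


def crop(dimension, start, end):
    """NumPy stand-in (not the thread's code): slice one axis from start to end."""
    def func(x):
        # Keep every axis whole except the requested one.
        index = [slice(None)] * x.ndim
        index[dimension] = slice(start, end)
        return x[tuple(index)]
    return func


# Hypothetical batch of 2 records, each of shape (5000, 20).
batch = np.arange(2 * 5000 * 20, dtype=np.float32).reshape(2, 5000, 20)

# The call from the question keeps channel 0 only.
first_channel = crop(2, 0, 1)(batch)
print(first_channel.shape)

# One slice per channel requires one call per channel.
per_channel = [crop(2, i, i + 1)(batch) for i in range(batch.shape[2])]
print(len(per_channel), per_channel[0].shape)
```

In a Keras model the same idea applies, with each slice wrapped in its own Lambda layer.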
I believe what you want to achieve could be done this way:
    import numpy as np
    from keras.layers import Dense, Concatenate, Input, Lambda
    from keras.models import Model

    num_channels = 20
    input = Input(shape=(5000, num_channels))
    branch_outputs = []
    for i in range(num_channels):
        # Slicing the ith channel (last axis); i=i binds the current value of i to each lambda:
        out = Lambda(lambda x, i=i: x[:, :, i])(input)
        # Setting up your per-channel layers (replace with actual sub-models):
        out = Dense(16)(out)
        branch_outputs.append(out)
    # Concatenating together the per-channel results:
    out = Concatenate()(branch_outputs)
    # Adding some further layers (replace or remove with your architecture):
    out = Dense(10)(out)
    # Building model:
    model = Model(inputs=input, outputs=out)
    model.compile(optimizer='adam',  # Adam with its default learning rate of 0.001
                  loss='categorical_crossentropy', metrics=['accuracy'])
    # --------------
    # Generating dummy data:
    data = np.random.random((64, 5000, num_channels))
    targets = np.random.randint(2, size=(64, 10))
    # Training the model:
    model.fit(data, targets, epochs=2, batch_size=32)
    # Epoch 1/2
    # 32/64 [==============>...............] - ETA: 1s - loss: 37.1219 - acc: 0.1562
    # 64/64 [==============================] - 2s 27ms/step - loss: 38.4801 - acc: 0.1875
    # Epoch 2/2
    # 32/64 [==============>...............] - ETA: 0s - loss: 38.9541 - acc: 0.0938
    # 64/64 [==============================] - 0s 4ms/step - loss: 36.0179 - acc: 0.1875
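Separately from the model definition, the ValueError quoted in the question looks like a NumPy error raised while filling the batch, not a Keras one. Here is a minimal sketch of how it could arise, assuming the generator still allocated X for a single channel (for example with n_channels left at 1 and dim=(5000, 1)). The sizes below and the zero-filled record are placeholders, not values from your code.

```python
import numpy as np

batch_size = 4           # placeholder value
dim = (5000,)            # assumed: samples per record
n_channels = 20

# Stand-in for np.load(self.data_path + ID + '.npy')
record = np.zeros((5000, n_channels), dtype=np.float32)

# Buffer shaped for one channel: each slot X_single[i] has shape (5000, 1).
X_single = np.empty((batch_size, 5000, 1), dtype=np.float32)
try:
    X_single[0,] = record   # (5000, 20) cannot be stored in a (5000, 1) slot
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)

# Buffer shaped for all channels: the full record fits.
X_full = np.empty((batch_size, *dim, n_channels), dtype=np.float32)
X_full[0,] = record
print(X_full.shape)
```

If that is what happens in your generator, constructing it with dim=(5000,) and n_channels=20 should let the whole record through to the Input(shape=(5000, 20)) layer.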