I'm new to Keras. My dataset is large, so I'm rewriting my code to work in batches. Here is my generator:
def batch_generator(csv_file, chunk_size,
                    steps, var_list):
    idx = 1
    while True:
        yield load_data(csv_file, idx-1, chunk_size, var_list)  ## Yields data
        if idx < steps:
            idx += 1
        else:
            idx = 1

def load_data(csv_file, idx,
              chunk_size, var_list):
    global col_names
    if idx == 0:
        df = pd.read_csv(
            csv_file,
            nrows=chunk_size)
        col_names = df.columns
    else:
        df = pd.read_csv(
            csv_file, skiprows=idx*chunk_size,
            nrows=chunk_size,
            header=None, names=col_names)
    x = df[var_list]
    y = df['targets_LJ']
    return (np.array(x), to_categorical(y))
And here is the machine learning part of my code:
#create iterator over dataframe
train_gen = batch_generator(filepath_train, chunk_size, steps, list_of_vars)
val_gen = batch_generator(filepath_val, chunk_size, steps_val, list_of_vars)
# now make the network
from keras.layers import Input, Dense, Softmax
from keras.models import Model
#layers are functions that construct the deep learning model
#tensors define the data flow through the model
input_tensor = Input(shape = (len(list_of_vars),))
node1_layer = Dense(2)
node1_tensor = node1_layer(input_tensor)
output_layer = Softmax()
output_tensor = output_layer(node1_tensor)
#build model
model = Model(input_tensor, output_tensor)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
#create early stopping for if the nn is not improving
from keras.callbacks import EarlyStopping
early_stop = EarlyStopping(monitor='val_loss', patience=2)
#fit model
history = model.fit_generator(generator=train_gen,
                              validation_data=val_gen,
                              steps_per_epoch=steps, epochs=args.epochs, validation_steps=steps_val, callbacks=[early_stop])
Since switching from fit to fit_generator, I get an error that I wasn't getting before:
Traceback (most recent call last):
  File "./train_nn.py", line 162, in <module>
    run()
  File "./train_nn.py", line 145, in run
    steps_per_epoch=steps, epochs=args.epochs, validation_steps=steps_val, callbacks=[early_stop])
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
    class_weight=class_weight)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/keras/engine/training.py", line 1211, in train_on_batch
    class_weight=class_weight)
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/keras/engine/training.py", line 789, in _standardize_user_data
    exception_prefix='target')
  File "/opt/ohpc/pub/packages/anaconda3/lib/python3.7/site-packages/keras/engine/training_utils.py", line 138, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected softmax_1 to have shape (2,) but got array with shape (1,)
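I don't understand what is going wrong here. I'm using 'categorical_crossentropy', and as far as I can tell my targets are categorical, so as far as I understand the two should work together.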
Thanks, Sarah
Answer 0 (score: 0)
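Your model's output shape is (2,), because your last layer has 2 units.
Since you are using 'softmax', I assume you are doing binary classification, right?
But the targets you are feeding it have shape (1,), which means you do not have two classes there! You have only one class. In the usual binary classification, the data is labelled with zeros (one class) and ones (the other class).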
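One way a target of shape (1,) can arise even after to_categorical: when num_classes is not passed, the width of the one-hot array is taken from the largest label in the array it receives, so a chunk that happens to contain only zeros is encoded with a single column. The sketch below imitates that behaviour with a hypothetical NumPy helper, one_hot (it is not a Keras function), and assumes the labels in targets_LJ are the integers 0 and 1. Whether your file really contains such a chunk is an assumption you would need to check.

```python
import numpy as np


def one_hot(labels, num_classes=None):
    """Hypothetical stand-in for to_categorical, for illustration only."""
    labels = np.asarray(labels, dtype=int).ravel()
    if num_classes is None:
        # Width is inferred from the largest label seen in this batch.
        num_classes = int(labels.max()) + 1
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


mixed_chunk = [0, 1, 1, 0]       # both classes present
zeros_only_chunk = [0, 0, 0, 0]  # only class 0 present

print(one_hot(mixed_chunk).shape)                      # (4, 2)
print(one_hot(zeros_only_chunk).shape)                 # (4, 1)
print(one_hot(zeros_only_chunk, num_classes=2).shape)  # (4, 2)
```

If this turns out to be the cause, calling to_categorical(y, num_classes=2) inside load_data should give every chunk the same two-column shape, so the 2-unit softmax model could stay as it is.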
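If you do have the usual 0/1 labels, your last layer must have only 1 unit. Your last activation should be 'sigmoid', and the loss should be 'binary_crossentropy'. This way, you do not need to change your data.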
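A minimal sketch of the single-unit setup, reusing the layer and compile calls from the question. The function names build_binary_model and to_binary_targets are hypothetical, and the sketch assumes that targets_LJ holds the integers 0 and 1 and that load_data returns these targets in place of to_categorical(y).

```python
import numpy as np


def to_binary_targets(labels):
    """Return 0/1 labels as a float32 column of shape (n, 1), with no one-hot step."""
    return np.asarray(labels, dtype="float32").reshape(-1, 1)


def build_binary_model(n_features):
    """Single-unit sigmoid classifier compiled with binary cross-entropy."""
    # Keras is imported here so the helper above can be used without it.
    from keras.layers import Input, Dense
    from keras.models import Model

    input_tensor = Input(shape=(n_features,))
    # One unit: the output is the predicted probability of class 1.
    output_tensor = Dense(1, activation='sigmoid')(input_tensor)
    model = Model(input_tensor, output_tensor)
    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model


print(to_binary_targets([0, 1, 1, 0]).shape)  # (4, 1)
```

In the question's script this would be used as model = build_binary_model(len(list_of_vars)), with the fit_generator call left unchanged. The CSV itself stays as it is; only the encoding step in load_data differs.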