I want to train a convnet to classify more than 240,000 documents. To do this, I took the first 60 words of each document and converted them into indices. I tried implementing a OneHot layer in Keras to avoid memory problems, but that model performs much worse than a model fed data that was already one-hot encoded. What is the real difference?
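For context, the index preparation looks roughly like this (a simplified sketch; texts, model_max, and the Tokenizer settings are placeholders for my actual preprocessing):

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

input_length = 60                           # first 60 words of each document
tokenizer = Tokenizer(num_words=model_max)  # model_max: vocabulary size (assumed)
tokenizer.fit_on_texts(texts)               # texts: list of raw document strings (assumed)
X = pad_sequences(tokenizer.texts_to_sequences(texts),
                  maxlen=input_length,
                  truncating='post')        # keep the first 60 words
# X has shape (n_docs, 60) and contains integer word indices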
Apart from the additional one-hot Lambda layer, the shapes and parameter counts reported by model.summary() are similar for both models. I used the OneHot function described here: https://fdalvi.github.io/blog/2018-04-07-keras-sequential-onehot/
from keras import backend as K
from keras.layers import Lambda

def OneHot(input_dim=None, input_length=None):
    # input_dim refers to the eventual length of the one-hot vector (e.g. vocab size)
    # input_length refers to the length of the input sequence
    # Check if inputs were supplied correctly
    if input_dim is None or input_length is None:
        raise TypeError("input_dim or input_length is not set")

    # Helper method (not inlined for clarity)
    def _one_hot(x, num_classes):
        return K.one_hot(K.cast(x, 'uint8'),
                         num_classes=num_classes)

    # Final layer representation as a Lambda layer
    return Lambda(_one_hot,
                  arguments={'num_classes': input_dim},
                  input_shape=(input_length,))
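A quick shape sanity check of the layer (1000 and 60 are arbitrary example values, not my real model_max and input_length):

import numpy as np
from keras.models import Sequential

m = Sequential()
m.add(OneHot(input_dim=1000, input_length=60))
print(m.output_shape)                       # (None, 60, 1000)
x = np.random.randint(0, 1000, size=(2, 60))
print(m.predict(x).shape)                   # (2, 60, 1000)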
# Model A: the Keras model I use with the OneHot function
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, GlobalAveragePooling1D, Dropout, Dense
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam

model = Sequential()
model.add(OneHot(input_dim=model_max,
                 input_length=input_length))
model.add(Conv1D(256, 6, activation='relu'))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(labels_max, activation='softmax'))

checkpoint = ModelCheckpoint('model-best.h5', verbose=1,
                             monitor='val_loss', save_best_only=True, mode='auto')

model.compile(optimizer=Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
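Model A is trained directly on the integer index matrix (a sketch; X_train, y_train, X_val, y_val and the epoch/batch settings are placeholders, not my exact run):

model.fit(X_train, y_train,                 # X_train: (n_docs, 60) integer indices
          validation_data=(X_val, y_val),   # y_*: one-hot labels, shape (n_docs, labels_max)
          epochs=20, batch_size=128,
          callbacks=[checkpoint])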
# Model B: the model I use with the data already converted to one-hot
model = Sequential()
model.add(Conv1D(256, 6, activation='relu',
                 input_shape=(input_length, model_max)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(GlobalAveragePooling1D())
model.add(Dropout(0.5))
model.add(Dense(labels_max, activation='softmax'))

checkpoint = ModelCheckpoint('model-best.h5', verbose=1,
                             monitor='val_loss', save_best_only=True, mode='auto')

model.compile(optimizer=Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
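For Model B the data is one-hot encoded up front, roughly like this (a sketch; X_train is the integer index matrix from above):

from keras.utils import to_categorical

# Dense one-hot input of shape (n_docs, input_length, model_max).
# At 240,000 docs x 60 positions x model_max floats, this array is
# what causes the memory errors mentioned below.
X_train_oh = to_categorical(X_train, num_classes=model_max)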
Model B performs much better, reaching up to 60% validation accuracy, but it easily runs into memory errors. Model A is faster but tops out at about 25% validation accuracy. I would expect them to perform similarly. What am I missing here? Thanks!