How to use the pretrained ELMo model trained on Wikipedia and New Corpus (5.5B) with the Keras framework

Time: 2019-05-01 07:16:17

Tags: keras elmo

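How can I access ELMo embeddings trained on Wikipedia + New Corpus (5.5B) from the Keras framework for text classification?

I cannot work out how to load the ELMo model trained on Wikipedia + New Corpus (5.5B).

I have written the following code, which uses the ELMo model trained on the 1 Billion Word Corpus for text classification with TensorFlow Hub and Keras: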

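from keras.layers import Input, Lambda, LSTM, Dense
from keras.models import Model
import keras.backend as K
import tensorflow as tf
import tensorflow_hub as tf_hub

# TensorFlow 1.x session shared with Keras
sess = tf.Session()
K.set_session(sess)

# elmo_model is not defined in the original post; it is assumed here to be
# the TF Hub ELMo module trained on the 1 Billion Word Benchmark
elmo_model = tf_hub.Module("https://tfhub.dev/google/elmo/2")

def ELMoEmbedding(x):
    # x has shape (batch, 1); squeeze only axis 1 so that a batch of size 1
    # keeps its batch dimension
    return elmo_model(tf.squeeze(tf.cast(x, tf.string), axis=1), signature="default", as_dict=True)["elmo"]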

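# ELMo vectors: one raw sentence string per example
elmo_input_layer = Input(shape=(1,), dtype=tf.string)

# The "default" signature pads to the longest sentence in the batch,
# so the time dimension is left unspecified
embed_elmo = Lambda(ELMoEmbedding, output_shape=(None, 1024))(elmo_input_layer)

lstm = LSTM(256, dropout=0.2, recurrent_dropout=0.2)(embed_elmo)

# softmax over the 4 classes, matching sparse_categorical_crossentropy
output_layer = Dense(4, activation='softmax')(lstm)

model = Model(inputs=elmo_input_layer, outputs=output_layer)

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Initialize the TF Hub module's variables and lookup tables before training
sess.run(tf.global_variables_initializer())
sess.run(tf.tables_initializer())

model.summary()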

I found the following code at https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md for accessing the ELMo model trained on the 5.5B corpus:


from allennlp.modules.elmo import Elmo, batch_to_ids

options_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_options.json"

weight_file = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5"

# Compute two different representations for each token.
# Each representation is a linear weighted combination of the
# 3 layers in ELMo (i.e., the char CNN and the outputs of the two BiLSTM layers)
elmo = Elmo(options_file, weight_file, 2, dropout=0)

# use batch_to_ids to convert sentences to character ids
sentences = [['First', 'sentence', '.'], ['Another', '.']]
character_ids = batch_to_ids(sentences)

embeddings = elmo(character_ids)
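
The comment in the snippet above describes each representation as a linear weighted combination of the three ELMo layers. A minimal pure-Python sketch of that mixing step for a single token follows, assuming softmax-normalized layer weights and a scalar `gamma` as described in the ELMo paper. The names `softmax`, `scalar_mix`, `scores` and `gamma` are illustrative and are not AllenNLP's implementation.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scalar_mix(layers: Sequence[Vector], scores: Sequence[float], gamma: float = 1.0) -> Vector:
    """Combine one token's per-layer vectors: gamma * sum_j softmax(scores)[j] * layers[j]."""
    if len(layers) != len(scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(scores)
    dim = len(layers[0])
    return [gamma * sum(w * layer[i] for w, layer in zip(weights, layers)) for i in range(dim)]


# Toy 2-dimensional vectors standing in for the 1024-dimensional ELMo layers
# (char CNN output and the two BiLSTM layers) of a single token.
token_layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Equal scores give equal weights, so the mix is the plain average of the layers.
mixed = scalar_mix(token_layers, scores=[0.0, 0.0, 0.0])
```

In the real model the scores and `gamma` are learned, and passing `2` to `Elmo(...)` requests two such independently weighted mixes.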

However, I do not understand how to use this code in the Keras text classification model above.
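
One possible way to connect the two snippets, offered as an assumption rather than an approach confirmed in this thread, is to run the AllenNLP `Elmo` module as a frozen feature extractor and train the Keras classifier on the precomputed vectors instead of on raw strings. The sketch below is hypothetical: `pad_or_truncate`, `elmo_features` and `build_classifier` are illustrative names, and `max_len` is a parameter you would choose. The URLs quoted above do not contain `5.5B` in their file names, so they may point to the original model rather than the 5.5B one; for that reason the file locations are left as parameters.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # one row of floats per token


def pad_or_truncate(sentence: Matrix, max_len: int, dim: int) -> Matrix:
    """Return exactly max_len rows: cut long sentences, zero-pad short ones."""
    rows = [list(row) for row in sentence[:max_len]]
    rows.extend([0.0] * dim for _ in range(max_len - len(rows)))
    return rows


def elmo_features(sentences: Sequence[Sequence[str]], options_file: str,
                  weight_file: str, max_len: int) -> List[Matrix]:
    """Embed tokenized sentences with AllenNLP and return fixed-length matrices."""
    # Imported lazily so the padding helper works without AllenNLP installed.
    from allennlp.modules.elmo import Elmo, batch_to_ids

    elmo = Elmo(options_file, weight_file, 1, dropout=0)
    elmo.eval()
    output = elmo(batch_to_ids(sentences))
    # Shape (batch, longest sentence in batch, embedding dim)
    vectors = output["elmo_representations"][0].detach().numpy()
    dim = vectors.shape[-1]
    # Keep only the real tokens of each sentence, then zero-pad to max_len.
    return [pad_or_truncate(vec[:len(sent)].tolist(), max_len, dim)
            for vec, sent in zip(vectors, sentences)]


def build_classifier(max_len: int, dim: int = 1024, num_classes: int = 4):
    """Keras classifier over precomputed ELMo vectors (same head as the model above)."""
    from keras.layers import Input, Masking, LSTM, Dense
    from keras.models import Model

    inputs = Input(shape=(max_len, dim))
    masked = Masking(mask_value=0.0)(inputs)  # skip the zero-padded time steps
    hidden = LSTM(256, dropout=0.2, recurrent_dropout=0.2)(masked)
    outputs = Dense(num_classes, activation="softmax")(hidden)
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model


# The padding step on a toy 2-token sentence with 2-dimensional vectors.
toy = pad_or_truncate([[0.1, 0.2], [0.3, 0.4]], max_len=3, dim=2)
```

With both libraries installed, the features returned by `elmo_features(...)` could be converted with `numpy.array(...)` and passed to `build_classifier(max_len).fit(...)` together with integer labels. ELMo stays frozen in this setup, which differs from the TensorFlow Hub version where the embedding runs inside the Keras graph.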

0 answers:

No answers yet