I have a dataset that looks like this:
q1 q2 label
ccc ddd 1
zzz yyy 0
. . .
. . .
where q1 and q2 are sentences, and the label says whether they are duplicates.
Now I am confused: since I have two inputs, q1 and q2, how do I combine both of them for prediction? I created a CNN function for each of the two columns, and then I want to concatenate them.
My CNN function:
from keras.layers import Input, Embedding, Conv1D, MaxPooling1D, \
    Flatten, Concatenate, Dropout, Dense
from keras.models import Model
from keras.regularizers import l2

def cnn_model(FILTER_SIZES,                # filter sizes as a list
              MAX_NB_WORDS,                # total number of words
              MAX_DOC_LEN,                 # max words in a doc
              EMBEDDING_DIM=200,           # word vector dimension
              NUM_FILTERS=64,              # number of filters for each size
              DROP_OUT=0.5,                # dropout rate
              NUM_OUTPUT_UNITS=1,          # number of output units
              NUM_DENSE_UNITS=100,         # number of units in the dense layer
              PRETRAINED_WORD_VECTOR=None, # pretrained word vectors, if any
              LAM=0.0):                    # L2 regularization coefficient

    main_input = Input(shape=(MAX_DOC_LEN,), dtype='int32',
                       name='main_input')

    if PRETRAINED_WORD_VECTOR is not None:
        embed_1 = Embedding(input_dim=MAX_NB_WORDS + 1,
                            output_dim=EMBEDDING_DIM,
                            input_length=MAX_DOC_LEN,
                            # use pretrained word vectors
                            weights=[PRETRAINED_WORD_VECTOR],
                            # word vectors can be further tuned;
                            # set trainable=False for static word vectors
                            trainable=True,
                            name='embedding')(main_input)
    else:
        embed_1 = Embedding(input_dim=MAX_NB_WORDS + 1,
                            output_dim=EMBEDDING_DIM,
                            input_length=MAX_DOC_LEN,
                            name='embedding')(main_input)

    # add a convolution-pooling-flatten block per filter size
    conv_blocks = []
    for f in FILTER_SIZES:
        conv = Conv1D(filters=NUM_FILTERS, kernel_size=f,
                      activation='relu', name='conv_' + str(f))(embed_1)
        conv = MaxPooling1D(MAX_DOC_LEN - f + 1, name='max_' + str(f))(conv)
        conv = Flatten(name='flat_' + str(f))(conv)
        conv_blocks.append(conv)

    if len(conv_blocks) > 1:
        z = Concatenate(name='concate')(conv_blocks)
    else:
        z = conv_blocks[0]

    # dropout for regularization before the dense layer
    drop = Dropout(DROP_OUT, name='dropout')(z)
    dense = Dense(NUM_DENSE_UNITS, activation='relu',
                  kernel_regularizer=l2(LAM), name='dense')(drop)
    model = Model(inputs=main_input, outputs=dense)
    model.compile(loss="binary_crossentropy",
                  optimizer="adam", metrics=["accuracy"])
    return model
First, I tokenize and pad both columns:
tokenizer = Tokenizer(num_words=MAX_NB_WORDS)
# fit on both columns so q2's vocabulary is covered as well
tokenizer.fit_on_texts(data["q1"].tolist() + data["q2"].tolist())

# set the dense units
dense_units_num = num_filters * len(FILTER_SIZES)
BATCH_SIZE = 32
NUM_EPOCHES = 100

sequences_1 = tokenizer.texts_to_sequences(data["q1"])
# print(sequences_1)
sequences_2 = tokenizer.texts_to_sequences(data["q2"])
sequences = sequences_1 + sequences_2

output_units_num = 1

# pad all sequences into the same length:
# if a sentence is shorter than maxlen, pad it on the right;
# if a sentence is longer than maxlen, truncate it on the right
padded_sequences = pad_sequences(sequences,
                                 maxlen=MAX_DOC_LEN,
                                 padding='post',
                                 truncating='post')
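The effect of padding='post' and truncating='post' can be illustrated with a tiny pure-Python reimplementation (a sketch of the behavior only, not Keras's actual code; the maxlen of 5 is an arbitrary example value):

```python
MAX_DOC_LEN = 5  # example value, not taken from the post

def pad_post(seq, maxlen):
    # truncate on the right if too long, then pad with zeros on the right
    return seq[:maxlen] + [0] * (maxlen - len(seq))

print(pad_post([3, 7, 2], MAX_DOC_LEN))           # [3, 7, 2, 0, 0]
print(pad_post([1, 2, 3, 4, 5, 6], MAX_DOC_LEN))  # [1, 2, 3, 4, 5]
```

Every row that comes out of pad_sequences therefore has exactly MAX_DOC_LEN integers, which is what the Input(shape=(MAX_DOC_LEN,)) layer expects.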
Then, for the two columns, I built two models like this:
left_cnn = cnn_model(FILTER_SIZES, MAX_NB_WORDS, MAX_DOC_LEN,
                     NUM_FILTERS=num_filters,
                     NUM_OUTPUT_UNITS=output_units_num,
                     NUM_DENSE_UNITS=dense_units_num,
                     PRETRAINED_WORD_VECTOR=None)
right_cnn = cnn_model(FILTER_SIZES, MAX_NB_WORDS, MAX_DOC_LEN,
                      NUM_FILTERS=num_filters,
                      NUM_OUTPUT_UNITS=output_units_num,
                      NUM_DENSE_UNITS=dense_units_num,
                      PRETRAINED_WORD_VECTOR=None)
Now I don't know how to concatenate these two models. What should I do next?
Answer 0 (score: 0)
You can have multiple inputs in your CNN model, like this:
from keras.layers import Input, Conv1D, MaxPool1D, Flatten, Dense, concatenate
from keras.models import Model
from keras.optimizers import Adam

inputLayerQ1 = Input(shape=(10, 1))
inputLayerQ2 = Input(shape=(10, 1))

# one shared Conv layer, applied to both inputs
conv1 = Conv1D(10, 5, strides=1, padding='same',
               dilation_rate=1, activation='relu')
convQ1 = conv1(inputLayerQ1)
poolLayerQ1 = MaxPool1D(pool_size=5, strides=1, padding='valid')(convQ1)
convQ2 = conv1(inputLayerQ2)
poolLayerQ2 = MaxPool1D(pool_size=5, strides=1, padding='valid')(convQ2)

concatLayerQ = concatenate([poolLayerQ1, poolLayerQ2], axis=1)
flatLayerQ = Flatten()(concatLayerQ)
denseLayerQ = Dense(10, activation='relu')(flatLayerQ)
outputLayer = Dense(2, activation='softmax')(denseLayerQ)

model = Model(inputs=[inputLayerQ1, inputLayerQ2], outputs=outputLayer)
model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.01),
              metrics=['accuracy'])
model.fit([inputQ1, inputQ2], outputLabel, epochs=10, steps_per_epoch=10)
This is a simple three-layer network. Each sentence goes through a Conv layer, the Conv results are concatenated to form the input of the dense layer, and the final dense output layer has two units that distinguish between class 0 and class 1.
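Applied to the question's setup, the same multi-input idea can be sketched with a shared embedding and convolution branch for both padded question sequences (a minimal sketch using tensorflow.keras; the vocabulary size, sequence length, and filter settings below are illustrative assumptions, not values from the post):

```python
from tensorflow.keras.layers import (Input, Embedding, Conv1D,
                                     GlobalMaxPooling1D, Concatenate,
                                     Dropout, Dense)
from tensorflow.keras.models import Model

# illustrative sizes (assumptions, not values from the post)
MAX_NB_WORDS = 1000
MAX_DOC_LEN = 20
EMBEDDING_DIM = 50

q1_input = Input(shape=(MAX_DOC_LEN,), dtype='int32', name='q1')
q2_input = Input(shape=(MAX_DOC_LEN,), dtype='int32', name='q2')

# shared layers: the same weights encode both questions
embed = Embedding(input_dim=MAX_NB_WORDS + 1, output_dim=EMBEDDING_DIM)
conv = Conv1D(filters=64, kernel_size=3, activation='relu')
pool = GlobalMaxPooling1D()

def encode(x):
    # embedding -> convolution -> max-over-time pooling
    return pool(conv(embed(x)))

merged = Concatenate()([encode(q1_input), encode(q2_input)])
merged = Dropout(0.5)(merged)
# a single sigmoid unit: probability that q1 and q2 are duplicates
pred = Dense(1, activation='sigmoid')(merged)

model = Model(inputs=[q1_input, q2_input], outputs=pred)
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# training would then take the two padded halves as a list of inputs:
# model.fit([padded_q1, padded_q2], labels, batch_size=32, epochs=10)
```

Because the embedding and convolution layers are shared, both questions are mapped into the same feature space before being concatenated, which is usually what you want for a symmetric duplicate-detection task; a single sigmoid output with binary_crossentropy matches the 0/1 label column directly.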