Using a CNN with two inputs for prediction

Time: 2019-05-15 21:59:36

Tags: python machine-learning keras conv-neural-network

I have a dataset like this:

q1    q2    label
ccc   ddd     1
zzz   yyy     0
 .     .      .
 .     .      .

where q1 and q2 are sentences, and label says whether or not they are duplicates.
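For concreteness, a dataset of this shape can be held in a pandas DataFrame (the sentences below are just the placeholders from the table above, not real data):

```python
import pandas as pd

# toy version of the q1 / q2 / label table
data = pd.DataFrame({
    "q1":    ["ccc", "zzz"],
    "q2":    ["ddd", "yyy"],
    "label": [1, 0],   # 1 = duplicate pair, 0 = not a duplicate
})

print(data.shape)    # (2, 3)
print(list(data.columns))  # ['q1', 'q2', 'label']
```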

Now I am confused: since I have two inputs, q1 and q2, how do I combine both of them for prediction? I created one CNN model per column with the function below, and now I want to concatenate the two.

My cnn function:

    from tensorflow.keras.layers import (Input, Embedding, Conv1D,
                                         MaxPooling1D, Flatten,
                                         Concatenate, Dropout, Dense)
    from tensorflow.keras.models import Model
    from tensorflow.keras.regularizers import l2

    def cnn_model(FILTER_SIZES,              # filter sizes as a list
                  MAX_NB_WORDS,              # total number of words
                  MAX_DOC_LEN,               # max words in a doc
                  EMBEDDING_DIM=200,         # word vector dimension
                  NUM_FILTERS=64,            # number of filters for all sizes
                  DROP_OUT=0.5,              # dropout rate
                  NUM_OUTPUT_UNITS=1,        # number of output units
                  NUM_DENSE_UNITS=100,       # number of units in dense layer
                  PRETRAINED_WORD_VECTOR=None,  # pretrained word vectors, if any
                  LAM=0.0):                  # L2 regularization coefficient

        main_input = Input(shape=(MAX_DOC_LEN,),
                           dtype='int32', name='main_input')

        if PRETRAINED_WORD_VECTOR is not None:
            embed_1 = Embedding(input_dim=MAX_NB_WORDS+1,
                                output_dim=EMBEDDING_DIM,
                                input_length=MAX_DOC_LEN,
                                # use pretrained word vectors;
                                # word vectors can be further tuned,
                                # set trainable=False for static vectors
                                weights=[PRETRAINED_WORD_VECTOR],
                                trainable=True,
                                name='embedding')(main_input)
        else:
            embed_1 = Embedding(input_dim=MAX_NB_WORDS+1,
                                output_dim=EMBEDDING_DIM,
                                input_length=MAX_DOC_LEN,
                                name='embedding')(main_input)

        # add one convolution-pooling-flatten block per filter size
        conv_blocks = []
        for f in FILTER_SIZES:
            conv = Conv1D(filters=NUM_FILTERS, kernel_size=f,
                          activation='relu', name='conv_'+str(f))(embed_1)
            conv = MaxPooling1D(MAX_DOC_LEN-f+1, name='max_'+str(f))(conv)
            conv = Flatten(name='flat_'+str(f))(conv)
            conv_blocks.append(conv)

        if len(conv_blocks) > 1:
            z = Concatenate(name='concate')(conv_blocks)
        else:
            z = conv_blocks[0]

        # apply dropout before the dense layer
        drop = Dropout(rate=DROP_OUT, name='dropout')(z)

        dense = Dense(NUM_DENSE_UNITS, activation='relu',
                      kernel_regularizer=l2(LAM), name='dense')(drop)

        model = Model(inputs=main_input, outputs=dense)

        model.compile(loss="binary_crossentropy",
                      optimizer="adam", metrics=["accuracy"])

        return model

First, I tokenize and pad both columns:

    tokenizer = Tokenizer(num_words=MAX_NB_WORDS)
    # fit the tokenizer on both columns so that words
    # appearing only in q2 are also in the vocabulary
    tokenizer.fit_on_texts(list(data["q1"]) + list(data["q2"]))

    # set the dense units
    dense_units_num = num_filters*len(FILTER_SIZES)

    BATCH_SIZE = 32
    NUM_EPOCHES = 100

    sequences_1 = tokenizer.texts_to_sequences(data["q1"])
    # print(sequences_1)

    sequences_2 = tokenizer.texts_to_sequences(data["q2"])

    sequences = sequences_1 + sequences_2

    output_units_num = 1

    # pad all sequences to the same length:
    # if a sentence is shorter than maxlen, pad it on the right;
    # if a sentence is longer than maxlen, truncate it on the right
    padded_sequences = pad_sequences(sequences,
                                     maxlen=MAX_DOC_LEN,
                                     padding='post',
                                     truncating='post')
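The padding step is easiest to see on a tiny example. Here is a pure-Python sketch that mimics the behavior of `pad_sequences(padding='post', truncating='post')` (for intuition only; it is not the Keras implementation):

```python
def pad_post(sequences, maxlen):
    """Right-truncate long sequences and right-pad short ones with 0,
    mimicking pad_sequences(padding='post', truncating='post')."""
    padded = []
    for seq in sequences:
        seq = list(seq)[:maxlen]                       # truncate on the right
        padded.append(seq + [0] * (maxlen - len(seq)))  # pad on the right
    return padded

print(pad_post([[5, 3, 8], [7]], maxlen=2))  # [[5, 3], [7, 0]]
```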

Then, for the two columns, I build two models like this:

    left_cnn = cnn_model(FILTER_SIZES, MAX_NB_WORDS,
                         MAX_DOC_LEN,
                         NUM_FILTERS=num_filters,
                         NUM_OUTPUT_UNITS=output_units_num,
                         NUM_DENSE_UNITS=dense_units_num,
                         PRETRAINED_WORD_VECTOR=None)

    right_cnn = cnn_model(FILTER_SIZES, MAX_NB_WORDS,
                          MAX_DOC_LEN,
                          NUM_FILTERS=num_filters,
                          NUM_OUTPUT_UNITS=output_units_num,
                          NUM_DENSE_UNITS=dense_units_num,
                          PRETRAINED_WORD_VECTOR=None)

Now I don't know how to concatenate these two models. What should I do next?

1 Answer:

Answer 0 (score: 0)

You can have multiple inputs in a CNN model, like this:

from tensorflow.keras.layers import (Input, Conv1D, MaxPool1D,
                                     concatenate, Flatten, Dense)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

inputLayerQ1 = Input(shape=(10, 1))
inputLayerQ2 = Input(shape=(10, 1))

# one Conv layer, shared by both inputs
conv1 = Conv1D(10, 5, strides=1, padding='same',
               dilation_rate=1, activation='relu')

convQ1 = conv1(inputLayerQ1)
poolLayerQ1 = MaxPool1D(pool_size=5, strides=1, padding='valid')(convQ1)
convQ2 = conv1(inputLayerQ2)
poolLayerQ2 = MaxPool1D(pool_size=5, strides=1, padding='valid')(convQ2)

# join the two branches, then flatten for the dense layers
concatLayerQ = concatenate([poolLayerQ1, poolLayerQ2], axis=1)
flatLayerQ = Flatten()(concatLayerQ)
denseLayerQ = Dense(10, activation='relu')(flatLayerQ)

outputLayer = Dense(2, activation='softmax')(denseLayerQ)

model = Model(inputs=[inputLayerQ1, inputLayerQ2], outputs=outputLayer)
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(learning_rate=0.01), metrics=['accuracy'])

model.fit([inputQ1, inputQ2], outputLabel, epochs=10, steps_per_epoch=10)

This is a simple 3-layer network. There is one Conv layer applied to each sentence; the results of the Conv layers are concatenated to create the input of a dense layer, and finally there is an output dense layer with two units that distinguishes between class 0 and class 1.
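Applied to the question's setup, the same idea looks roughly like this: build one embedding-plus-convolution branch per question, concatenate the branch features, and train a single model on both padded inputs. This is a minimal sketch, not the asker's `cnn_model`; the vocabulary size, sequence length, and layer sizes below are illustrative assumptions:

```python
from tensorflow.keras.layers import (Input, Embedding, Conv1D,
                                     GlobalMaxPooling1D, Concatenate, Dense)
from tensorflow.keras.models import Model

MAX_NB_WORDS, MAX_DOC_LEN, EMBEDDING_DIM = 1000, 20, 50  # illustrative sizes

def branch(name):
    """One text branch: input -> embedding -> conv -> global max pool."""
    inp = Input(shape=(MAX_DOC_LEN,), name=name)
    x = Embedding(MAX_NB_WORDS + 1, EMBEDDING_DIM)(inp)
    x = Conv1D(32, 3, activation='relu')(x)
    x = GlobalMaxPooling1D()(x)
    return inp, x

in_q1, feat_q1 = branch('q1_input')
in_q2, feat_q2 = branch('q2_input')

merged = Concatenate()([feat_q1, feat_q2])    # join the two branches
out = Dense(1, activation='sigmoid')(merged)  # 1 = duplicate, 0 = not

model = Model(inputs=[in_q1, in_q2], outputs=out)
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])
# model.fit([padded_q1, padded_q2], data["label"], ...)
```

With a single sigmoid unit and `binary_crossentropy`, the labels can be fed in directly as 0/1 (no one-hot encoding needed), which matches the label column in the question.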