TensorFlow / Keras: how do I convert tf.feature_column into input tensors?

Asked: 2019-03-29 16:06:29

Tags: tensorflow keras

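I have the following code to average the embeddings of a list of item IDs. (The embedding is trained on review_meta_id_input and is then used as the lookup table for priors_input to get the average embedding.)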

 review_meta_id_input = tf.keras.layers.Input(shape=(1,), dtype='int32', name='review_meta_id')
 priors_input = tf.keras.layers.Input(shape=(None,), dtype='int32', name='priors') # array of ids
 item_embedding_layer = tf.keras.layers.Embedding(
     input_dim=100,      # max number
     output_dim=self.item_embedding_size,
     name='item')

 review_meta_id_embedding = item_embedding_layer(review_meta_id_input)
 selected = tf.nn.embedding_lookup(review_meta_id_embedding, priors_input)
 non_zero_count =  tf.cast(tf.math.count_nonzero(priors_input, axis=1), tf.float32)
 embedding_sum = tf.reduce_sum(selected, axis=1)
 item_average = tf.math.divide(embedding_sum, non_zero_count)
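
To make the intended averaging concrete, here is a minimal plain-Python sketch of a padding-aware mean over embedding rows. It assumes that ID 0 marks padding (which the use of count_nonzero above suggests) and that the embedding table is a simple list of rows; the function name and the tiny table are hypothetical and only illustrate the arithmetic, not the TensorFlow graph itself.

```python
def masked_mean_embedding(embedding_table, ids, pad_id=0):
    """Average the embedding rows for `ids`, skipping `pad_id` entries."""
    dim = len(embedding_table[0])
    total = [0.0] * dim
    count = 0
    for item_id in ids:
        if item_id == pad_id:
            continue  # padding adds nothing to the sum or to the count
        row = embedding_table[item_id]
        for d in range(dim):
            total[d] += row[d]
        count += 1
    if count == 0:
        return total  # only padding: return zeros instead of dividing by zero
    return [value / count for value in total]


# Hypothetical table: 3 items with 2-dimensional embeddings (row 0 is the padding ID).
table = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]
average = masked_mean_embedding(table, [1, 2, 0, 0])
print(average)
```

Unlike this sketch, the snippet above sums over every looked-up row, so the two only agree if the row for ID 0 is all zeros.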

I also have some feature columns, for example: (I just thought feature_column looked cool, but I can't find much documentation.)

 kid_youngest_month = feature_column.numeric_column("kid_youngest_month")
 kid_age_youngest_buckets = feature_column.bucketized_column(kid_youngest_month, boundaries=[12, 24, 36, 72, 96])
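
For intuition, the bucketing described by this column can be sketched in plain Python. The five boundaries produce six buckets, each including its left boundary and excluding its right one, and the dense representation is a one-hot vector over those buckets. The helper names below are hypothetical; this is only an illustration of the mapping, not how TensorFlow implements it.

```python
from bisect import bisect_right

BOUNDARIES = [12, 24, 36, 72, 96]  # same boundaries as the bucketized column


def bucket_index(value, boundaries=BOUNDARIES):
    """Return the bucket number: 0 for value < 12, 1 for 12 <= value < 24, and so on."""
    return bisect_right(boundaries, value)


def one_hot_bucket(value, boundaries=BOUNDARIES):
    """Return a one-hot list with len(boundaries) + 1 entries."""
    vector = [0.0] * (len(boundaries) + 1)
    vector[bucket_index(value, boundaries)] = 1.0
    return vector


print(bucket_index(30))
print(one_hot_bucket(30))
```

A value equal to a boundary, such as 12, falls into the bucket that starts at that boundary.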

I want to define [review_meta_id_input, priors_input, (tensors from feature_columns)] as the inputs to the Keras model.

Something like:

 inputs = [review_meta_id_input, priors_input] + feature_layer
 model = tf.keras.models.Model(inputs=inputs, outputs=o)

To get tensors from the feature columns, the closest lead I have so far is

fc_to_tensor = {fc: input_layer(features, [fc]) for fc in feature_columns}

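from https://github.com/tensorflow/tensorflow/issues/17170

But I'm not sure what features is in that code.
There is no clear example at https://www.tensorflow.org/api_docs/python/tf/feature_column/input_layer either.

How do I construct features for the fc_to_tensor variable?

Or is there a way to use keras.layers.Input and feature_column at the same time?

Or is there an alternative to tf.feature_column for doing the bucketing shown above? If so, I'll just drop feature_column for now.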

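1 Answer:

Answer 0: (score: 0)

The behavior you want can be achieved with the following steps.

This works in TF 2.0.0-beta1, but it may change or even be simplified in later versions.

Please check out the issue "Unable to use FeatureColumn with Keras Functional API #27416" in the TensorFlow GitHub repository. There you will find a more general example and useful comments about tf.feature_column and the Keras Functional API.

Meanwhile, based on the code in your question, the input tensors for feature_column could be built like this: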

import tensorflow as tf
from tensorflow import feature_column

# These are the feature columns you have already defined
kid_youngest_month = feature_column.numeric_column("kid_youngest_month")
kid_age_youngest_buckets = feature_column.bucketized_column(kid_youngest_month, boundaries=[12, 24, 36, 72, 96])

# Then define the layer
feature_layer = tf.keras.layers.DenseFeatures([kid_age_youngest_buckets])

# The inputs for the DenseFeatures layer should be defined as a dictionary with one entry per original feature column, where
# keys - names of the feature columns
# values - tf.keras.Input with shape=(1,), name='name_of_feature_column', dtype - actual type of the original column
feature_layer_inputs = {}
feature_layer_inputs['kid_youngest_month'] = tf.keras.Input(shape=(1,), name='kid_youngest_month', dtype=tf.int8)

# Then you can collect the inputs of the other layers and feature_layer_inputs into one flat list
inputs = [review_meta_id_input, priors_input] + list(feature_layer_inputs.values())

# Then define the outputs of this DenseFeatures layer
feature_layer_outputs = feature_layer(feature_layer_inputs)
# And pass them into another layer like any other tensor
x = tf.keras.layers.Dense(256, activation='relu')(feature_layer_outputs)
# Or maybe concatenate them with the outputs of your other layers
combined = tf.keras.layers.concatenate([x, feature_layer_outputs])

# And you will probably finish with a last output layer, maybe like this for classification
# (classes_number is the number of target classes, defined elsewhere)
o = tf.keras.layers.Dense(classes_number, activation='softmax', name='sequential_output')(combined)

# So you pass to the model:

model_combined = tf.keras.models.Model(inputs=inputs, outputs=o)

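Also note: in the model's fit() method you have to pass the data that feeds each of these inputs.

One way: if you use tf.data.Dataset, make sure the feature names in the Dataset are the same as the keys in the feature_layer_inputs dictionary.

The other way is to use explicit notation, for example: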

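# Keys must match the `name` given to each Input layer (and to the output layer)
model.fit({'review_meta_id': review_meta_id_data, 'priors': priors_data, 'kid_youngest_month': kid_youngest_month_data},
          {'sequential_output': labels_data},  # labels_data: placeholder for your target array
          ...
         )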
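
Since a mismatch between dictionary keys and input names is easy to miss, a small check before calling fit() can help. The sketch below is plain Python and does not touch the model: the helper name is hypothetical, and the list of input names is written out by hand on the assumption that it mirrors the name arguments of the Input layers defined above.

```python
def check_feed_names(input_names, feed):
    """Return (missing, unexpected) key names for a dict of input data."""
    expected = set(input_names)
    provided = set(feed)
    return sorted(expected - provided), sorted(provided - expected)


# Names assumed to match the `name` argument of each Input layer.
input_names = ["review_meta_id", "priors", "kid_youngest_month"]

# Hypothetical feed that uses the Python variable names instead of the layer names.
feed = {
    "review_meta_id_input": [[1], [2]],
    "priors_input": [[1, 2, 0], [2, 0, 0]],
    "kid_youngest_month": [[10], [40]],
}

missing, unexpected = check_feed_names(input_names, feed)
print("missing:", missing)
print("unexpected:", unexpected)
```

Here the check reports review_meta_id and priors as missing, because the feed was keyed by variable names rather than by the names of the Input layers.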