import tensorflow as tf
feature_names = ['education']
d = dict(zip(feature_names, [["Bachelors","11th"]]))
print(d)
education_vocabulary_list = [
    'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college',
    'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school',
    '5th-6th', '10th', '1st-4th', 'Preschool', '12th']
education = tf.feature_column.categorical_column_with_vocabulary_list(
    'education', vocabulary_list=education_vocabulary_list)
education_indicator = tf.feature_column.indicator_column(education)
feature_columns = [education_indicator]
print(feature_columns)
input_layer = tf.feature_column.input_layer(
    features=d,
    feature_columns=feature_columns
)
with tf.train.MonitoredTrainingSession() as sess:
    print(input_layer)
    print(sess.run(input_layer))
In the example above, I get the following output:
[<tf.Tensor 'input_layer/concat:0' shape=(2, 16) dtype=float32>]
[array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
      dtype=float32)]
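I expected the output to be the dense tensor described in the documentation (https://www.tensorflow.org/api_docs/python/tf/feature_column/indicator_column).
Basically, I want to multi-hot encode a space-separated categorical variable. I will then feed this column, along with other features, to a DNNClassifier for model training.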
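To make the goal concrete, the following is a plain-Python sketch of what multi-hot encoding a space-separated value means. It is not TensorFlow code and does not use indicator_column; the helper multi_hot and the sample inputs "Bachelors 11th" and "HS-grad" are hypothetical, chosen only for illustration.

```python
# Plain-Python illustration of multi-hot encoding (not a TensorFlow API).
education_vocabulary_list = [
    'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college',
    'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school',
    '5th-6th', '10th', '1st-4th', 'Preschool', '12th']


def multi_hot(value, vocabulary):
    """Return a 0/1 list with a 1 at the index of every token in `value`."""
    index = {token: i for i, token in enumerate(vocabulary)}
    row = [0.0] * len(vocabulary)
    for token in value.split():  # split on whitespace
        if token in index:       # out-of-vocabulary tokens are ignored
            row[index[token]] = 1.0
    return row


# Hypothetical input: the first row holds two space-separated categories.
rows = ["Bachelors 11th", "HS-grad"]
encoded = [multi_hot(r, education_vocabulary_list) for r in rows]
for r in encoded:
    print(r)
```

In this sketch the first row has ones at the positions of both 'Bachelors' and '11th', while the second row has a single one at the position of 'HS-grad'.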
How can I use TensorFlow's indicator_column to multi-hot encode a space-separated categorical variable?