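All of the examples that use the TensorFlow feature_column API show how to create the raw features in input_fn, then build an array of feature_column definitions describing the desired mappings, and then pass it to the Estimator. At runtime, the Estimator combines the two and performs the actual feature encoding. How can this be done outside the Estimator API? I have looked through the TensorFlow source code and come up empty-handed.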
Below is some source code that demonstrates what I need. I would like to use age-buckets and education to create a combined feature, with [2,0] as the result.
import tensorflow as tf
feature_names = ['age', 'education']
label_names = ['>50K', '<=50K']
d = dict(zip(feature_names, [34, 'Bachelors'])), '>50K'
print(d)
with tf.Session() as sess:
    age = tf.feature_column.numeric_column('age')
    age_buckets = tf.feature_column.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
    education = tf.feature_column.categorical_column_with_vocabulary_list(
        'education', [
            'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college',
            'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school',
            '5th-6th', '10th', '1st-4th', 'Preschool', '12th'])
    base_columns = [age_buckets, education]
    print(base_columns)
Answer 0 (score: 0)
It turns out that, in addition to tf.feature_column.input_layer, you also need to use tf.train.MonitoredTrainingSession() so that the required lookup tables get initialized.
import tensorflow as tf
feature_names = ['age', 'education']
d = dict(zip(feature_names, [[34], ['Bachelors']])), '>50K'
print(d[0])
age = tf.feature_column.numeric_column('age')
age_buckets = tf.feature_column.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
education_vocabulary_list = [
    'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college',
    'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school',
    '5th-6th', '10th', '1st-4th', 'Preschool', '12th']
education = tf.feature_column.categorical_column_with_vocabulary_list('education', vocabulary_list=education_vocabulary_list)
education_indicator = tf.feature_column.indicator_column(education)
feature_columns = [age_buckets, education_indicator]
print(feature_columns)
input_layer = tf.feature_column.input_layer(
    features=d[0],
    feature_columns=feature_columns
)
zero = tf.constant(0, dtype=tf.float32)
where = tf.not_equal(input_layer, zero)
indices = tf.where(where)
values = tf.gather_nd(input_layer, indices)
sparse = tf.SparseTensor(indices, values, input_layer.shape)
# MonitoredTrainingSession initializes the vocabulary lookup table before running.
with tf.train.MonitoredTrainingSession() as sess:
    print(input_layer)
    print(sess.run(input_layer))
    print(sess.run(sparse))
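To make it easier to sanity-check what the dense output above should contain, here is a minimal pure-Python sketch of the same encoding, with no TensorFlow involved. The helper names (bucketize, one_hot, encode) are hypothetical and only for illustration. It assumes that a value equal to a boundary falls into the upper bucket and that the bucketized age column comes before the education indicator in the concatenated vector; both are my understanding of how the feature columns behave, so treat the layout as an assumption rather than a guarantee.

```python
from bisect import bisect_right

# Same boundaries and vocabulary as in the answer above.
BOUNDARIES = [18, 25, 30, 35, 40, 45, 50, 55, 60, 65]
EDUCATION_VOCAB = [
    'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college',
    'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school',
    '5th-6th', '10th', '1st-4th', 'Preschool', '12th']


def bucketize(value, boundaries):
    """Return the bucket index for value.

    Assumption: buckets are (-inf, b0), [b0, b1), ..., [bn, +inf),
    so a value equal to a boundary lands in the upper bucket.
    """
    return bisect_right(boundaries, value)


def one_hot(index, depth):
    """Return a list of length depth with 1.0 at index and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


def encode(age, education):
    """Hypothetical helper: one-hot age bucket followed by one-hot education.

    Note: an out-of-vocabulary education value raises ValueError here;
    this sketch does not try to mimic any default handling for unknown values.
    """
    age_part = one_hot(bucketize(age, BOUNDARIES), len(BOUNDARIES) + 1)
    edu_part = one_hot(EDUCATION_VOCAB.index(education), len(EDUCATION_VOCAB))
    return age_part + edu_part


dense = encode(34, 'Bachelors')
# Positions of the non-zero entries, analogous to the sparse indices above.
nonzero = [i for i, v in enumerate(dense) if v != 0.0]
print(len(dense), nonzero)  # prints: 27 [3, 11]
```

Under these assumptions, age 34 falls into bucket 3 (the range from 30 up to 35) and 'Bachelors' is vocabulary index 0, which sits at position 11 once the 11 age buckets are placed first.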