I'm trying to declare the numeric and categorical columns that I will feed to a TensorFlow estimator.
import tensorflow as tf
import pandas as pd
perform_shuffle = True
BATCH_SIZE = 1
repeat_count = 1
d = {
    'c1': ["Sony", "Samsung", "Sony", "Sony", "Samsung"],
    'n2': [24, 20, 18, 26, 24],
    'n3': [1, 0, 0, 1, 1]
}
features = pd.DataFrame(data=d)
labels = [0,1,0,0,1]
numeric_feature_column1 = tf.feature_column.numeric_column(key="n2", dtype=tf.float32)
numeric_feature_column2 = tf.feature_column.numeric_column(key="n3", dtype=tf.float32)
categorical_column1 = tf.feature_column.categorical_column_with_hash_bucket(key="c1", hash_bucket_size=5)
feature_columns = [categorical_column1, numeric_feature_column1, numeric_feature_column2]
def my_input_fn(features, labels, perform_shuffle=False, repeat_count=1):
    train_dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    if perform_shuffle:
        # Shuffle the input using a buffer of BATCH_SIZE elements (held in memory)
        train_dataset = train_dataset.shuffle(buffer_size=BATCH_SIZE)
    train_dataset = train_dataset.repeat(repeat_count)  # Repeat the dataset repeat_count times
    train_dataset = train_dataset.batch(BATCH_SIZE)  # Batch size to use
    # Create an iterator of the correct shape and type
    iterator = train_dataset.make_one_shot_iterator()
    batch_features, batch_labels = iterator.get_next()
    return batch_features, batch_labels
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[40, 60, 30, 12],
    n_classes=2,
    model_dir="./")  # Path to where checkpoints etc are stored
classifier.train(input_fn=lambda: my_input_fn(features, labels, True, 1))
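But even with this simple dataset, I get the following error:
TypeError: Expected binary or unicode string, got 24
I tried changing the order of the columns in the feature_columns array, but that doesn't change anything. The second column (n2) is declared as numeric, so it should accept an int or a float at the very least. What am I missing here?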
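
One possible explanation, offered as a sketch rather than a confirmed diagnosis: `from_tensor_slices` receives the whole DataFrame as a single value, so the string column and the integer columns would have to share one tensor type. The snippet below uses only pandas on the same toy data and shows that the frame as a whole has no common numeric type, while each column taken separately keeps its own. The names `whole_frame_dtype`, `columns_by_name` and `column_kinds` are made up for this illustration.

```python
import pandas as pd

# Same toy data as in the question.
features = pd.DataFrame({
    "c1": ["Sony", "Samsung", "Sony", "Sony", "Samsung"],
    "n2": [24, 20, 18, 26, 24],
    "n3": [1, 0, 0, 1, 1],
})

# Viewed as one array, the mixed string/int columns collapse to dtype object,
# so there is no single string or numeric type that fits every cell.
whole_frame_dtype = features.to_numpy().dtype

# Split into one array per column, each column keeps its own dtype.
columns_by_name = {name: features[name].to_numpy() for name in features.columns}
column_kinds = {name: values.dtype.kind for name, values in columns_by_name.items()}

print(whole_frame_dtype)
print(column_kinds)
```

If that assumption holds, passing a per-column mapping instead of the frame, for example `tf.data.Dataset.from_tensor_slices((dict(features), labels))`, would hand the estimator one tensor per feature name, which is the form the feature columns look up through their `key` argument.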