Expected binary or unicode string, got 24: TensorFlow Dataset numeric/categorical columns

Time: 2018-03-01 15:14:33

Tags: python pandas tensorflow tensorflow-datasets

I am trying to declare the numeric and categorical columns that I will feed to a TensorFlow estimator.

import tensorflow as tf
import pandas as pd

perform_shuffle = True
BATCH_SIZE = 1
repeat_count = 1

d = {
    'c1': ["Sony", "Samsung", "Sony", "Sony", "Samsung"], 
    'n2': [24,20,18,26,24],
    'n3': [1,0,0,1,1]
    }
features = pd.DataFrame(data=d)
labels = [0,1,0,0,1]



numeric_feature_column1 = tf.feature_column.numeric_column(key="n2", dtype=tf.float32)
numeric_feature_column2 = tf.feature_column.numeric_column(key="n3", dtype=tf.float32)
categorical_column1 = tf.feature_column.categorical_column_with_hash_bucket(key="c1", hash_bucket_size=5)
feature_columns = [categorical_column1, numeric_feature_column1, numeric_feature_column2]

def my_input_fn(features, labels, perform_shuffle=False, repeat_count=1):

    train_dataset = tf.data.Dataset.from_tensor_slices((features, labels))

    if perform_shuffle:
        # Shuffles using a buffer of BATCH_SIZE elements (read into memory);
        # note that a buffer of size 1 leaves the order unchanged
        train_dataset = train_dataset.shuffle(buffer_size=BATCH_SIZE)
    train_dataset = train_dataset.repeat(repeat_count) # Repeats dataset this # times
    train_dataset = train_dataset.batch(BATCH_SIZE)  # Batch size to use

    # Create an iterator that yields (features, labels) batches
    iterator = train_dataset.make_one_shot_iterator()
    batch_features, batch_labels = iterator.get_next()
    return batch_features, batch_labels

classifier = tf.estimator.DNNClassifier(
   feature_columns=feature_columns, 
   hidden_units=[40, 60, 30, 12], 
   n_classes=2,
   model_dir="./") # Path to where checkpoints etc are stored

classifier.train(input_fn=lambda: my_input_fn(features, labels, True, 1))

But even with this simple dataset, I get the following error:

TypeError: Expected binary or unicode string, got 24

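I tried changing the order of the columns in the feature_columns list, but that changes nothing. Column n2 is declared as numeric, so it should accept an int or a float at the very least. What am I missing here?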
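One likely explanation, offered as a sketch rather than a confirmed diagnosis: when the whole DataFrame is passed to `tf.data.Dataset.from_tensor_slices`, it is converted to a single array, and a frame that mixes strings and integers can only become an array of dtype `object`. TensorFlow then appears to treat that array as a string tensor, which would explain why it complains about the integer 24. The snippet below uses only pandas to show the dtype difference between the whole frame and a per-column dict. The names `whole_frame_dtype`, `columns`, and `column_kinds` are illustrative and do not appear in the question.

```python
import pandas as pd

# Same toy data as in the question.
features = pd.DataFrame({
    'c1': ["Sony", "Samsung", "Sony", "Sony", "Samsung"],
    'n2': [24, 20, 18, 26, 24],
    'n3': [1, 0, 0, 1, 1],
})

# Converting the whole frame to one array forces a single dtype for all
# columns; with strings and integers mixed, that dtype is `object`.
whole_frame_dtype = features.values.dtype

# Splitting the frame into a dict keeps one array per column, so the
# numeric columns keep their integer dtype and only 'c1' holds strings.
columns = {name: features[name].values for name in features.columns}
column_kinds = {name: arr.dtype.kind for name, arr in columns.items()}

print(whole_frame_dtype)
print(column_kinds)
```

If this is indeed the cause, passing a dict of columns instead of the DataFrame, for example `tf.data.Dataset.from_tensor_slices((dict(features), labels))` inside `my_input_fn`, should give each feature column a tensor of the expected type, keyed by the same names ("c1", "n2", "n3") that the feature columns use.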

0 answers:

No answers yet