TensorFlow InternalError: Unable to get element as bytes

Time: 2018-03-26 20:59:27

Tags: python tensorflow jupyter-notebook

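I am trying to run a DNNClassifier with TensorFlow on some log data that contains a mix of categorical and numeric data. I have created feature columns to specify and bucket/hash the data for TensorFlow. When I run the code, I receive the 'Unable to get element as bytes' internal error. Note: I did not want to drop the NaN values as described in this article, so I converted them to 0 using the code below, and I am not sure why I still get this error. If I call dropna it works, but I don't want to drop the NaNs because I feel they are needed for model training.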

train = train.fillna(0, axis=0)

Running the following code then raises the error:

import tensorflow as tf

def create_train_input_fn():
    return tf.estimator.inputs.pandas_input_fn(
        x=train,
        y=train_label, 
        batch_size=32,
        num_epochs=None,
        shuffle=True)

def create_test_input_fn():
    return tf.estimator.inputs.pandas_input_fn(
        x=valid,
        y=valid_label, 
        num_epochs=1,
        shuffle=False)

feature_columns = []
end_time = tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_hash_bucket('end_time', 1000), 10)
feature_columns.append(end_time)
device = tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_hash_bucket('device', 1000), 10)
feature_columns.append(device)
device_os = tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_hash_bucket('device_os', 1000), 10)
feature_columns.append(device_os)
device_os_version = tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_hash_bucket('device_os_version', 1000), 10)
feature_columns.append(device_os_version)
Latency = tf.feature_column.bucketized_column(
    tf.feature_column.numeric_column('Latency'), 
    boundaries=[.000000, .000010, .000100, .001000, .010000, .100000])
feature_columns.append(Latency)
Megacycles = tf.feature_column.bucketized_column(
    tf.feature_column.numeric_column('Megacycles'), 
    boundaries=[0, 50, 100, 200, 300])
feature_columns.append(Megacycles)
Cost = tf.feature_column.bucketized_column(
    tf.feature_column.numeric_column('Cost'), 
    boundaries=[0.000001e-08, 1.000000e-08, 5.000000e-08, 10.000000e-08, 15.000000e-08 ])
feature_columns.append(Cost)
device_brand = tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_hash_bucket('device_brand', 1000), 10)
feature_columns.append(device_brand)
device_family = tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_hash_bucket('device_family', 1000), 10)
feature_columns.append(device_family)
browser_version = tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_hash_bucket('browser_version', 1000), 10)
feature_columns.append(browser_version)
app = tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_hash_bucket('app', 1000), 10)
feature_columns.append(app)
ua_parse = tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_hash_bucket('ua_parse', 1000), 10)
feature_columns.append(ua_parse)

estimator = tf.estimator.DNNClassifier(hidden_units=[256, 128, 64], 
                                       feature_columns=feature_columns, 
                                       n_classes=2, 
                                       model_dir='graphs/dnn')

train_input_fn = create_train_input_fn()
estimator.train(train_input_fn, steps=2000)

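1 Answer:

Answer 0: (score: 0)

I agree with Thomas Decaux. I had exactly the same problem. When I checked, my labels were represented as strings ("yes" and "no") rather than integers (1, 0). After converting the labels to int64, the error no longer appeared.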
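A minimal sketch of the label conversion described above, using pandas. The sample data, the `label_map` dictionary, and the `"missing"` placeholder are assumptions made for illustration and do not come from the question. The second half covers a related possibility that the thread does not confirm: `fillna(0)` writes an integer 0 into string columns as well, and a column mixing strings and integers may trigger the same error, so filling by dtype could be worth trying.

```python
import pandas as pd

# Hypothetical stand-in for the label column: strings, not integers.
train_label = pd.Series(["yes", "no", "yes", "no"], name="label")

# Map the string classes to 0/1 (assumed mapping for this example).
label_map = {"no": 0, "yes": 1}
train_label_int = train_label.map(label_map)

# map() yields NaN for any value missing from label_map, so fail loudly
# rather than hand the estimator a float column containing NaN.
if train_label_int.isna().any():
    raise ValueError("found labels that are not in label_map")
train_label_int = train_label_int.astype("int64")

# Hypothetical feature frame with gaps in a string and a numeric column.
train = pd.DataFrame({
    "device": ["phone", None, "tablet", "phone"],
    "Latency": [0.001, None, 0.01, 0.1],
})

# Fill by dtype so string columns stay purely string-valued:
# a placeholder string for non-numeric columns, 0 for numeric ones.
numeric_cols = train.select_dtypes(include="number").columns
other_cols = train.select_dtypes(exclude="number").columns
train[other_cols] = train[other_cols].fillna("missing")
train[numeric_cols] = train[numeric_cols].fillna(0)
```

The converted `train_label_int` would then be passed as `y` to `pandas_input_fn` in place of the original string labels.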