Question

I'm new to TensorFlow. I have a dataset with continuous, discrete, and categorical values. Sample data is shown below:

     col1    col2    col3  col4  col5  col6  Class
0    22    23.40   45.60  11    1.0   0.0    0.0
1   346    67.40  235.60  23    1.0   1.0    0.0
2    22    67.34  364.66  17    0.0   0.0    1.0
3  1231   124.44  213.89  14    1.0   0.0    1.0

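col1 and col4 are discrete variables. col2 and col3 are continuous variables. col5 and col6 are categorical variables. Class is the target variable.

I'd like to know whether I can pass the data above directly as input to the placeholder X.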

X = tf.placeholder(tf.float32, [None, numFeatures])

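I don't need to apply tf.one_hot, right? My categorical variables are binary.

How does TensorFlow detect that col5 and col6 are categorical variables?

Any help would be appreciated. Thanks!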

Answer 1

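Since your categorical variables are binary, you can treat them as ints. You have to create placeholders, which are fed batch by batch later in the training part.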
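To see why one-hot encoding adds nothing for a binary column, here is a minimal NumPy sketch. It uses the col5 values from the sample rows in the question; the variable names are only illustrative. A depth-2 one-hot encoding produces two columns, and the second one is just the original 0/1 column while the first is its complement.

```python
import numpy as np

# col5 from the four sample rows in the question (binary categorical).
col5 = np.array([1, 1, 0, 1], dtype=np.int32)

# Depth-2 one-hot encoding: row i is the col5[i]-th row of the 2x2 identity.
one_hot = np.eye(2, dtype=np.float32)[col5]

# The second column reproduces col5; the first is its complement,
# so the encoding carries no information beyond the original column.
same_as_original = np.array_equal(one_hot[:, 1], col5.astype(np.float32))
is_complement = np.array_equal(one_hot[:, 0], 1.0 - one_hot[:, 1])
print(same_as_original, is_complement)
```

For a categorical column with more than two levels this shortcut would not apply, and a one-hot (or embedding) step would be worth considering.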

Here is how to declare the TensorFlow placeholders so that each one has the right dtype.

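import tensorflow as tf  # TensorFlow 1.x API; under 2.x use tf.compat.v1.placeholder

# Assumed shape: one column per placeholder, any batch size.
# Rank 2 is needed so the columns can be concatenated on axis=1.
shape = [None, 1]

# Discrete columns
var1 = tf.placeholder(tf.int32, shape)
var4 = tf.placeholder(tf.int32, shape)

# Continuous columns
var2 = tf.placeholder(tf.float32, shape)
var3 = tf.placeholder(tf.float32, shape)

# Binary categorical columns
var5 = tf.placeholder(tf.int32, shape)
var6 = tf.placeholder(tf.int32, shape)

# Target
class_ = tf.placeholder(tf.int32, shape)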

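To feed the set of variables to the model, you will later have to concatenate them. Before doing so, cast the tensors so that they all share the same dtype, because tf.concat cannot mix dtypes.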

var1 = tf.cast(var1, tf.float32)
...
data = tf.concat([var1,var4, var2,var3, var5, var6], axis=1)
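Putting the pieces together, the following is a minimal end-to-end sketch rather than a definitive implementation. It assumes TensorFlow 2.x, where the graph-mode API lives under tf.compat.v1 (on 1.x, tf.placeholder and tf.Session can be used directly), and it assumes a placeholder shape of [None, 1]. The NumPy arrays hold the four sample rows from the question.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TensorFlow 2.x

# The four sample rows from the question, one array per column.
col1 = np.array([[22], [346], [22], [1231]], dtype=np.int32)
col4 = np.array([[11], [23], [17], [14]], dtype=np.int32)
col2 = np.array([[23.40], [67.40], [67.34], [124.44]], dtype=np.float32)
col3 = np.array([[45.60], [235.60], [364.66], [213.89]], dtype=np.float32)
col5 = np.array([[1], [1], [0], [1]], dtype=np.int32)
col6 = np.array([[0], [1], [0], [0]], dtype=np.int32)

graph = tf.Graph()
with graph.as_default():
    shape = [None, 1]  # assumed: one column per placeholder, any batch size
    var1 = tf1.placeholder(tf.int32, shape)
    var4 = tf1.placeholder(tf.int32, shape)
    var2 = tf1.placeholder(tf.float32, shape)
    var3 = tf1.placeholder(tf.float32, shape)
    var5 = tf1.placeholder(tf.int32, shape)
    var6 = tf1.placeholder(tf.int32, shape)

    # tf.concat needs a single dtype, so cast the integer columns first.
    data = tf.concat(
        [tf.cast(var1, tf.float32), tf.cast(var4, tf.float32),
         var2, var3,
         tf.cast(var5, tf.float32), tf.cast(var6, tf.float32)],
        axis=1,
    )

# Feed one batch (here, all four rows) and fetch the combined feature matrix.
with tf1.Session(graph=graph) as sess:
    features = sess.run(
        data,
        feed_dict={var1: col1, var4: col4, var2: col2,
                   var3: col3, var5: col5, var6: col6},
    )

print(features.shape, features.dtype)
```

The column order in the result follows the order passed to tf.concat (col1, col4, col2, col3, col5, col6), not the order in the original table.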
