Question

I am trying to use the MLP classifier example from GitHub:

https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/multilayer_perceptron.py

However, I don't actually know how to use it with my own data. I have a feature matrix X of shape 20000x100 and a target vector y of length 20000 with 5 classes.

X and y are stored as NumPy arrays. What confuses me is the following:

x = tf.placeholder("float", [None, n_input]) #n_input is 100 here, right?
y = tf.placeholder("float", [None, n_classes])


total_batch = int(mnist.train.num_examples/batch_size) #What is that for my data?


batch_x, batch_y = mnist.train.next_batch(batch_size)#what are these?

Answer 1

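If you look at the variable batch_x, you will see that it is simply a NumPy array of shape [batch_size, 784], i.e. batch_size flattened images. batch_y is a NumPy array of shape [batch_size, 10], so each image in batch_x has one one-hot encoded label.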

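So, if you want to use your own data with this model, you must either:

format your data the same way ([batch_size, 784] and [batch_size, 10]), or
change the placeholders x and y so that they accept the shape of your own data.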

In your case, you would just change the code as follows:

n_input = 100
n_classes = 5
total_batch = int(20000/batch_size)

Better still, rather than hard-coding the numbers above, derive these values from your data:

n_input = your_x_data.shape[1]
n_classes = your_y_data.shape[1]
# or n_classes = int(your_y_data.max()) + 1
# if your y data holds integer labels 0..n_classes-1 and is not yet one-hot encoded
total_batch = int(your_x_data.shape[0]/batch_size)
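Because the y placeholder has shape [None, n_classes], a target vector of integer labels (as in the question) would need to be one-hot encoded before it is fed in. A minimal NumPy sketch of that step follows. It assumes the labels are integers in 0..n_classes-1. The helper name one_hot and the small stand-in array y_int are hypothetical and not part of the original script.

```python
import numpy as np

def one_hot(labels, n_classes=None):
    """Turn a 1-D array of integer labels (assumed 0..K-1) into a one-hot matrix."""
    labels = np.asarray(labels, dtype=np.int64)
    if n_classes is None:
        # Infer the number of classes from the largest label seen.
        n_classes = int(labels.max()) + 1
    encoded = np.zeros((labels.shape[0], n_classes), dtype=np.float32)
    # Set exactly one entry per row: the column given by that row's label.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Stand-in for the real target vector of length 20000 with 5 classes.
y_int = np.array([0, 3, 1, 4, 2, 4])
y_one_hot = one_hot(y_int, n_classes=5)
print(y_one_hot.shape)  # (6, 5)
```

With the encoded array, your_y_data.shape[1] gives n_classes directly, which matches the first variant above.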

Answer 2

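In addition to @ted's answer, you also have to modify the computation of total_batch. total_batch is the number of batches your network will process in each epoch. Assuming X holds your data, you must replace int(mnist.train.num_examples/batch_size) with int(20000/batch_size) or int(X.shape[0]/batch_size). You can choose the batch size there, for example 200, which gives 100 batches per epoch for 20000 samples.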
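Neither answer spells out what replaces mnist.train.next_batch(batch_size) for custom data. One possible approach, sketched here with plain NumPy, is a generator that yields slices of X and y. The function name iterate_minibatches, its parameters, and the random stand-in data are assumptions made for illustration. The labels are assumed to be one-hot encoded already.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, shuffle=True, seed=None):
    """Yield (batch_x, batch_y) pairs covering X and y once (one epoch)."""
    n_samples = X.shape[0]
    indices = np.arange(n_samples)
    if shuffle:
        # Leave seed as None during training so each epoch is shuffled differently.
        np.random.default_rng(seed).shuffle(indices)
    # Same quantity as total_batch; leftover samples beyond a full batch are dropped.
    total_batch = n_samples // batch_size
    for i in range(total_batch):
        idx = indices[i * batch_size:(i + 1) * batch_size]
        yield X[idx], y[idx]

# Random stand-in data with the shapes from the question.
X = np.random.rand(20000, 100).astype(np.float32)
y = np.eye(5, dtype=np.float32)[np.random.randint(0, 5, size=20000)]

batches = list(iterate_minibatches(X, y, batch_size=200, seed=0))
print(len(batches), batches[0][0].shape, batches[0][1].shape)
```

Each yielded pair would take the place of the batch_x, batch_y returned by mnist.train.next_batch(batch_size) in the training loop. Note that the example script already uses x and y as placeholder names, so the data arrays may need different names there.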
