TensorFlow won't train: 'DataFrame' objects are mutable, thus they cannot be hashed

Time: 2019-07-14 20:12:09

Tags: tensorflow training-data kaggle

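I want to build and train a neural network on the Kaggle "House Prices" dataset using TensorFlow (but without Keras; with Keras I can get it to work). I am using Python, and apart from the actual training my code runs fine. During training, however, I either get no error (but no training happens either), or I get TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed.

I ran the script as an IPython notebook on Google Colab, and I believe the main problem is the feed_dict input. However, I cannot see what is wrong with it. batch_X contains 100x10 features and batch_Y has 100 labels. I think this is the key line:

"train_data = {X: batch_X, Y_: batch_Y}"

"train_data" is what I pass in with "sess.run(train_step, feed_dict=train_data)".

Here is my code: https://colab.research.google.com/drive/1qabmzzicZVu7v72Be8kljM1pUaglb1bY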

# train and train_normalized are the training data set (DataFrame)
# train_labels_normalized are the labels only

#Start session:
with tf.Session() as sess:
  sess.run(init)

  possible_indeces = list(range(0, train.shape[0]))
  iterations = 1000
  batch_size = 100

  for step in range(0, iterations):
    #draw batch indeces:
    batch_indeces = random.sample(possible_indeces, batch_size)
    #get features and respective labels
    batch_X = np.array(train_normalized.iloc[batch_indeces])
    batch_Y = np.array(train_labels_normalized.iloc[batch_indeces])

    train_data = { X: batch_X, Y_: batch_Y}

    sess.run(train_step, feed_dict=train_data)

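What I expect is that it runs for a few minutes and returns optimized weights (two hidden layers with 48 nodes each) so that I can make predictions. Instead, it either skips the code above or throws the error below.

Does anyone know what is going wrong?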

TypeError Traceback (most recent call last)
<ipython-input-536-79506f90a868> in <module>()
     13     batch_Y = np.array(train_labels_normalized.iloc[batch_indeces])
     14 
---> 15     train_data = { X: batch_X, Y_: batch_Y}
     16 
     17     sess.run(train_step, feed_dict=train_data)

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in __hash__(self)
   1814     def __hash__(self):
   1815         raise TypeError('{0!r} objects are mutable, thus they cannot be'
-> 1816                         ' hashed'.format(self.__class__.__name__))
   1817 
   1818     def __iter__(self):

TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed

1 Answer:

Answer 0 (score: 0)

The problem is in your seventh (testing) step.

#Set X to the test data
X = test_normalized.astype(np.float32)
print(type(X))  # <class 'pandas.core.frame.DataFrame'>
Y1 = tf.nn.sigmoid(tf.matmul(X, W1))
Y2 = tf.nn.sigmoid(tf.matmul(Y1, W2))
Y3 = tf.matmul(Y2, W3)

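You are setting X to a DataFrame. On the first run this has no effect. But when you run step 6 again after step 7, you hit this error, because you have overwritten X: it no longer refers to the placeholder but to the DataFrame. The feed_dict then uses that DataFrame as a dictionary key, and dictionary keys must be hashable.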
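The mechanism can be reproduced without TensorFlow or pandas. The sketch below is only an illustration: Placeholder, Frame and build_feed are hypothetical stand-ins (not part of either library) for the tf.placeholder, the DataFrame, and the feed_dict construction in the question.

```python
class Placeholder:
    """Hypothetical stand-in for a tf.placeholder: an ordinary, hashable object."""

    def __init__(self, name):
        self.name = name


class Frame:
    """Hypothetical stand-in for a pandas DataFrame: it refuses to be hashed."""

    def __hash__(self):
        raise TypeError("'Frame' objects are mutable, thus they cannot be hashed")


def build_feed(key, batch):
    """Build a feed_dict-like mapping; raises TypeError if key is unhashable."""
    return {key: batch}


# "Step 6" in a fresh notebook: X is still the placeholder, so this works.
X = Placeholder("X")
feed = build_feed(X, [[0.1, 0.2]])

# "Step 7": the name X is rebound to the test data, shadowing the placeholder.
X = Frame()

# Re-running "step 6" now fails, because X can no longer be a dict key.
try:
    build_feed(X, [[0.1, 0.2]])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

This is also why the error seems to come and go in a notebook: whether it appears depends on the order in which the cells were last run, not on the training cell itself.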

Try renaming X to X_ in that step:

X_ = test_normalized.astype(np.float32)
Y1 = tf.nn.sigmoid(tf.matmul(X_, W1))

Also, your final evaluation does not work as written. Run it inside a tf.Session.