Question

I built a neural network in TensorFlow and created placeholders like this:

input_tensor = tf.placeholder(tf.float32, shape = (None,n_input), name = "input_tensor")
output_tensor = tf.placeholder(tf.float32, shape = (None,n_classes), name = "output_tensor")

During training I get the following error:

Traceback (most recent call last):
  File "try.py", line 150, in <module>
    sess.run(optimizer, feed_dict={X: x_train[i: i + 1], Y: y_train[i: i + 1]})
TypeError: unhashable type: 'numpy.ndarray'

From what I found, this happens because the dtypes of my x_train and y_train differ from the dtype of the placeholders.

My x_train looks something like this:

array([[array([[ 1.,  0.,  0.],
   [ 0.,  1.,  0.]])],
   [array([[ 0.,  1.,  0.],
   [ 1.,  0.,  0.]])],
   [array([[ 0.,  0.,  1.],
   [ 0.,  1.,  0.]])]], dtype=object)

Originally this was a DataFrame like this:

0  [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
1  [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
2  [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

I ran x_train = train_x.values to get the numpy array.

y_train looks like this:

array([[ 1.,  0.,  0.],
   [ 0.,  1.,  0.],
   [ 0.,  0.,  1.]])

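x_train has dtype object and y_train has dtype float64.

What I want to know is how to change the dtype of my training data so that it works with the TensorFlow placeholders. Or please tell me if I am missing something.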

Answer 1

Your x_train is a nested object containing arrays, so you have to unpack it and reshape it. Here is a general-purpose hack:

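import numpy as np

def unpack(a, aggregate=None):
    # Recursively collect every float found in a nested list/array structure.
    if aggregate is None:  # a mutable default ([]) would be shared across calls
        aggregate = []
    for x in a:
        if isinstance(x, (float, np.floating)):  # also matches numpy.float64
            aggregate.append(x)
        else:
            unpack(x, aggregate=aggregate)
    return np.array(aggregate)

# train_x is the original DataFrame from the question
x_train = unpack(train_x.values).reshape(train_x.shape[0], -1)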
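
If every cell holds an array of the same shape, a non-recursive alternative is to stack the cells and then flatten each sample. This is a minimal sketch rather than the method above; `x_obj` is a hypothetical stand-in built to mimic the object array shown in the question.

```python
import numpy as np

# Hypothetical stand-in for the question's x_train: shape (3, 1), dtype=object,
# where each cell holds a (2, 3) float array.
cells = [
    [[1., 0., 0.], [0., 1., 0.]],
    [[0., 1., 0.], [1., 0., 0.]],
    [[0., 0., 1.], [0., 1., 0.]],
]
x_obj = np.empty((3, 1), dtype=object)
for i, cell in enumerate(cells):
    x_obj[i, 0] = np.array(cell)

# Stack the cells into one dense float32 array of shape (3, 2, 3) ...
dense = np.stack([np.asarray(c, dtype=np.float32) for c in x_obj.ravel()])
# ... then flatten each sample so the result is 2-D: (n_samples, n_features).
x_dense = dense.reshape(dense.shape[0], -1)

print(x_dense.dtype, x_dense.shape)  # float32 (3, 6)
```

Unlike the recursive version, `np.stack` raises an error if the cells have different shapes, which can be a useful early warning about ragged data.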

Once you have a dense array (y_train is already dense), you can use a function like this:

def cast(placeholder, array):
    dtype = placeholder.dtype.as_numpy_dtype
    return array.astype(dtype)

x_train, y_train = cast(X,x_train), cast(Y,y_train)
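
Separately from the dtype, the message `unhashable type: 'numpy.ndarray'` is what Python raises when a NumPy array is used as a dictionary key. The placeholders in the question are named `input_tensor` and `output_tensor`, but the traceback feeds `X` and `Y`. If those names were bound to arrays somewhere in the script (an assumption, since the full script is not shown), the error would appear regardless of dtype. A minimal sketch that reproduces the message without TensorFlow:

```python
import numpy as np

# Hypothetical situation: X was meant to be a placeholder but holds an array.
X = np.zeros((1, 3), dtype=np.float32)
batch = np.ones((1, 3), dtype=np.float32)

try:
    feed_dict = {X: batch}  # dict keys must be hashable; ndarrays are not
    message = None
except TypeError as exc:
    message = str(exc)

print(message)  # unhashable type: 'numpy.ndarray'
```

If this turns out to be the cause, using the placeholder objects themselves as the `feed_dict` keys (here `input_tensor` and `output_tensor`) should avoid it.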

Answer 2

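It is a bit hard to guess what shape you want the data in, but I suspect you are looking for one of the two combinations below. I will also try to mock up your data in a Pandas DataFrame.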

import numpy as np
import pandas as pd

df = pd.DataFrame([[[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]],
                   [[[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]],
                   [[[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]]], columns=['Mydata'])
print(df)

x = df.Mydata.values
print(x.shape)
print(x)
print(x.dtype)

Output:

                               Mydata
0  [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
1  [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
2  [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

(3,)
[list([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
 list([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
 list([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])]
object

Combination 1

y = [item for sub_list in x for item in sub_list]
y = np.array(y, dtype = np.float32)
print(y.dtype, y.shape)
print(y)

Output:

float32 (6, 3)
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  1.  0.]
 [ 1.  0.  0.]
 [ 0.  0.  1.]
 [ 0.  1.  0.]]

Combination 2

y = [sub_list for sub_list in x]
y = np.array(y, dtype = np.float32)
print(y.dtype, y.shape)
print(y)

Output:

float32 (3, 2, 3)
[[[ 1.  0.  0.]
  [ 0.  1.  0.]]

 [[ 0.  1.  0.]
  [ 1.  0.  0.]]

 [[ 0.  0.  1.]
  [ 0.  1.  0.]]]
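
Which combination fits depends on the placeholder shape. As a rough check, and assuming a hypothetical helper `fits` that is not part of TensorFlow, an array shape can be compared against a placeholder shape such as `(None, n_input)`, where `None` matches any size:

```python
def fits(array_shape, placeholder_shape):
    """Return True if array_shape is compatible with placeholder_shape.

    None in placeholder_shape is treated as a wildcard dimension.
    """
    if len(array_shape) != len(placeholder_shape):
        return False
    return all(p is None or a == p
               for a, p in zip(array_shape, placeholder_shape))

combo1 = (6, 3)     # shape produced by combination 1
combo2 = (3, 2, 3)  # shape produced by combination 2

print(fits(combo1, (None, 3)))  # True: rank 2, 3 features per row
print(fits(combo2, (None, 3)))  # False: rank 3 vs. rank 2
print(fits((3, 6), (None, 6)))  # True: combination 2 flattened per sample
```

Note that combination 1 yields six rows, so it no longer lines up one-to-one with the three-row y_train from the question, whereas combination 2 reshaped to (3, 6) keeps one row per label.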

How to change the dtypes of numpy arrays for TensorFlow
