Question

I've been at this for several hours now and haven't found my specific problem addressed anywhere.

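So, I'm building a training set for a reinforcement learning model, and I want to save that training set to a csv file. Each record of the training set has the following format: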

[np.ndarray(shape=(18,8,8)), np.ndarray(shape=(1968,)), int64]

The input to the neural network is an 18x8x8 tensor / numpy array, and the output is a flat array (1968) policy + an integer value.

When I write this to the csv file, I use the following numpy functions on the input and policy elements of each record:

    input_bytes = inputs.tobytes() # inputs.tostring() also works
    policy_bytes = policy.tobytes() # same here, policy.tostring()
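
For context, here is a minimal sketch of what probably lands in the file, assuming the rows are written with Python's built-in `csv` module (the post does not say which writer is used). `csv.writer` stringifies non-string values with `str()`, so a `bytes` object is stored as its printable form, `b'...'`, rather than as raw bytes. The helper name `write_row_to_text` is hypothetical.

```python
import csv
import io

import numpy as np


def write_row_to_text(row):
    """Write one row with csv.writer and return the resulting CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


inputs = np.zeros(shape=(18, 8, 8), dtype=np.int32)
raw = inputs.tobytes()

# Round-trip a single record through CSV text and look at the first cell.
csv_text = write_row_to_text([raw, 1])
cell = next(csv.reader(io.StringIO(csv_text)))[0]

print(len(raw))   # number of raw bytes in the array
print(cell[:14])  # start of the text that was actually stored
print(len(cell))  # number of characters stored, not the same as len(raw)
```

Under that assumption, the stored cell is text that describes the bytes, which would explain the leading `b` character mentioned in the reading code.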

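When training time comes, I need to read these columns from the csv file and turn the bytes back into numpy.ndarray objects. I know the original dtypes and shapes: np.int32, (18,8,8) for the inputs and np.float64, (1968,) for the policy. So you would think I could simply use: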

    # need to use *_bytes[1:] because the 'b' character is written
    # to the csv when we save
    inputs = np.reshape(np.fromstring(input_bytes[1:], dtype=np.int32), (18,8,8))
    policy = np.fromstring(policy_bytes[1:], dtype=np.float64)

This fails with the error message:

ValueError: string size must be a multiple of element size

If I try converting the string to bytes and using frombuffer instead, i.e.

    inputs = np.reshape(np.frombuffer(bytes(input_bytes[1:], 'utf-8'), dtype=np.int32), (18,8,8))
    policy = np.frombuffer(bytes(policy_bytes[1:], 'utf-8'), dtype=np.float64)

I get essentially the same error:

ValueError: buffer size must be a multiple of element size
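
One possible explanation for both errors, sketched under the assumption that each csv cell holds the `str()` form of a bytes object: the text `b'\x00...'` spells out every non-printable byte as a four-character escape, so its length has no fixed relation to the element size. If that is the case, `ast.literal_eval` from the standard library can parse the whole cell (including the leading `b`) back into the original bytes. The variable `cell` below is a stand-in for a value read from the csv.

```python
import ast

import numpy as np

original = np.arange(18 * 8 * 8, dtype=np.int32).reshape(18, 8, 8)

# Stand-in for a csv cell. repr() of bytes gives the same text as str().
cell = repr(original.tobytes())

# Dropping the 'b' and encoding as UTF-8 keeps the escape text, not the data.
as_text_bytes = bytes(cell[1:], "utf-8")
print(len(original.tobytes()), len(as_text_bytes))

# Parsing the literal restores the raw bytes, which frombuffer can read.
restored_bytes = ast.literal_eval(cell)
restored = np.frombuffer(restored_bytes, dtype=np.int32).reshape(18, 8, 8)

print(np.array_equal(restored, original))
```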

I realize this must be some kind of encoding issue, but I can't pin down exactly what. Note that the following works perfectly:

    test_input = np.zeros(shape=(18,8,8), dtype=np.int32).tobytes()
    test_policy = np.zeros(shape=(1968,), dtype=np.float64).tobytes()

    np.reshape(np.frombuffer(test_input, dtype=np.int32), (18,8,8))
    np.frombuffer(test_policy, dtype=np.float64)

How can I write the bytes to a csv file and later load them back into ndarray objects by reading the file?
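
One way this could be done, offered as a sketch rather than a definitive answer: encode the raw bytes as base64 text before writing, so that the csv holds plain ASCII, then decode that text before calling `np.frombuffer`. The helper names `encode_array` and `decode_array` are hypothetical, and the example assumes the standard `csv` and `base64` modules, with an in-memory buffer standing in for a real file.

```python
import base64
import csv
import io

import numpy as np


def encode_array(array):
    """Return the array's raw bytes as an ASCII base64 string."""
    return base64.b64encode(array.tobytes()).decode("ascii")


def decode_array(text, dtype, shape):
    """Rebuild an array from base64 text, given its known dtype and shape."""
    return np.frombuffer(base64.b64decode(text), dtype=dtype).reshape(shape)


# Example record with the dtypes and shapes described in the question.
inputs = np.random.randint(0, 2, size=(18, 8, 8)).astype(np.int32)
policy = np.random.rand(1968)  # float64 by default
value = 1

# Write one record; io.StringIO stands in for a file opened with newline="".
buffer = io.StringIO()
csv.writer(buffer).writerow([encode_array(inputs), encode_array(policy), value])

# Read the record back and restore each column.
buffer.seek(0)
row = next(csv.reader(buffer))
inputs_back = decode_array(row[0], np.int32, (18, 8, 8))
policy_back = decode_array(row[1], np.float64, (1968,))
value_back = int(row[2])

print(np.array_equal(inputs_back, inputs), np.array_equal(policy_back, policy))
```

Base64 makes each cell roughly a third larger than the raw bytes. If the csv format is not a hard requirement, a binary container such as `np.savez` may be simpler.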

Edit: here is a sample of the csv:

numpy frombuffer() and tostring() - reading from csv

0 answers: