Question

I have been working with Torch. My current program needs to export a Tensor that contains a reduced feature matrix. I tried the following:

torch.save('t.csv',torch.Tensor({{1,2,3},{4,5,6}}),'ascii')

and the output was:

4
1
3
V 1
18
torch.DoubleTensor
2
2 3
3 1
1
4
2
3
V 1
19
torch.DoubleStorage
6
1 2 3 4 5 6

Expected output:

1, 2, 3
4, 5, 6

Does anyone know how to do this?

Answer 1

When it saves a tensor, Torch writes not only the data but also, as you can see, some additional information that it needs for later deserialization.
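To make those extra lines concrete, here is a rough Python sketch that reads the ASCII dump from the question and recovers the 2x3 matrix. The field layout it relies on (number of dimensions, sizes, strides, a 1-based storage offset, then the storage values on the last line) is inferred from that one dump and should be treated as an assumption, not a documented contract. `ascii_dump_to_rows` is a hypothetical helper name.

```python
# The ASCII dump from the question, copied verbatim.
DUMP = """\
4
1
3
V 1
18
torch.DoubleTensor
2
2 3
3 1
1
4
2
3
V 1
19
torch.DoubleStorage
6
1 2 3 4 5 6
"""


def ascii_dump_to_rows(text):
    """Recover a 2-D matrix from the dump (layout is an assumption, see above)."""
    lines = text.splitlines()
    # Locate the line naming the tensor class, e.g. "torch.DoubleTensor".
    start = next(i for i, line in enumerate(lines) if line.endswith("Tensor"))
    ndim = int(lines[start + 1])                               # assumed: number of dimensions
    sizes = [int(tok) for tok in lines[start + 2].split()]     # assumed: size per dimension
    strides = [int(tok) for tok in lines[start + 3].split()]   # assumed: stride per dimension
    offset = int(lines[start + 4]) - 1                         # assumed: 1-based storage offset
    values = [float(tok) for tok in lines[-1].split()]         # flat storage contents
    if ndim != 2:
        raise ValueError("this sketch only handles 2-D tensors")
    n_rows, n_cols = sizes
    return [
        [values[offset + r * strides[0] + c * strides[1]] for c in range(n_cols)]
        for r in range(n_rows)
    ]


rows = ascii_dump_to_rows(DUMP)
print(rows)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

Everything before the final line of the dump is bookkeeping, which is why the file does not look like CSV.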

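If you need CSV serialization, you are better off implementing it yourself.

Fortunately, that is quite straightforward.

Here is a quick example: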

require 'torch'

matrix = torch.Tensor(5,3) -- a 5x3 matrix

matrix:random(1,10) -- matrix initialized with random numbers in [1,10]

print(matrix) -- let's see the matrix content

subtensor = matrix[{{1,3}, {2,3}}] -- let's create a view on the row 1 to 3, for which we take columns 2 to 3 (the view is a 3x2 matrix, note that values are bound to the original tensor)

local out = assert(io.open("./dump.csv", "w")) -- open a file for serialization

splitter = ","
for i=1,subtensor:size(1) do
    for j=1,subtensor:size(2) do
        out:write(subtensor[i][j])
        if j == subtensor:size(2) then
            out:write("\n")
        else
            out:write(splitter)
        end
    end
end

out:close()

On my machine, the matrix printed as:

 10  10   6
  4   8   3
  3   8   5
  5   5   5
  1   6   8
[torch.DoubleTensor of size 5x3]

and the dumped file contains:

10,6
8,3
8,5
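For readers working in Python, the same loop can be sketched with plain nested lists. This is only an illustration of the logic, not Torch code: the matrix values are copied from the printout above, and Python slices produce copies, not views bound to the original data.

```python
import os
import tempfile

# The 5x3 matrix printed above, as plain nested lists.
matrix = [
    [10, 10, 6],
    [4, 8, 3],
    [3, 8, 5],
    [5, 5, 5],
    [1, 6, 8],
]

# Rows 1..3 and columns 2..3 in Lua's 1-based indexing
# correspond to rows 0..2 and columns 1..2 here.
subrows = [row[1:3] for row in matrix[0:3]]

# Write one comma-separated line per row to a temporary dump.csv.
path = os.path.join(tempfile.mkdtemp(), "dump.csv")
with open(path, "w") as out:
    for row in subrows:
        out.write(",".join(str(value) for value in row) + "\n")

# Read the file back to check its content.
with open(path) as f:
    dumped = f.read()
print(dumped)
```

The file content matches the dump shown above: three lines, `10,6`, `8,3` and `8,5`.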

HTH

Answer 2

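You can first convert the tensor to a Lua table with torch.totable, then save that table as a CSV file with the csvigo library. It may be a workaround, but I have not run into any problems with it.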
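This answer gives no code, so the following is only a Python sketch of the same two-step idea: turn the tensor into a nested table, then hand that table to a CSV library. Python's standard `csv` module stands in for csvigo here, the nested list stands in for the result of a tensor-to-table conversion (with PyTorch, `tensor.tolist()` would produce such a list), and `rows_to_csv_text` is a hypothetical helper name.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize nested lists (one inner list per row) to CSV text."""
    buffer = io.StringIO()
    # Use "\n" line endings instead of the csv module's default "\r\n".
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Stand-in for the nested table a tensor-to-table conversion would return.
table = [[1, 2, 3], [4, 5, 6]]
csv_text = rows_to_csv_text(table)
print(csv_text)
```

Letting a CSV library do the writing, as opposed to joining strings by hand, means quoting and escaping are handled for you if the values ever stop being plain numbers.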

Answer 3

For simple tables, you can also export by converting the tensor to a NumPy array and then to a Pandas DataFrame.

import torch
import numpy as np
import pandas as pd

t = torch.tensor([[1,2],[3,4]]) #dummy data

t_np = t.numpy() #convert to Numpy array
df = pd.DataFrame(t_np) #convert to a dataframe
df.to_csv("testfile",index=False) #save to file

#Then, to reload:
df = pd.read_csv("testfile")

Torch: saving a tensor to a CSV file