Question

I am saving a matrix with the Python csv writer as follows:

def write_to_disk(csv_path, mtx_norm, cell_ids, gene_symbols):
    print('writing the results to disk')
    with open(csv_path,'w', encoding='utf8') as csvfile:
        writer = csv.writer(csvfile, delimiter=',')
        writer.writerow(["", cell_ids])
        for idx, row in enumerate(mtx_norm):
            writer.writerow([gene_symbols[idx], row])

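My matrix contains a lot of zeros, and the csv writer contracts every run of similar numbers (zeros in this case), saving only `...` characters in their place. As a result, the file is saved as a bunch of arrays of varying lengths, and I then have trouble opening and using it. I can open a CSV that has not been contracted like this: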

data = np.genfromtxt(open(path_to_data, "r"), delimiter=",")

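but not the files saved by the csv writer. Is there a way to avoid this contraction, and/or to open both types of CSV file and convert them to a single format, a numpy 2D array, without these `...` items?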

Answer 1

If you are working with numpy arrays, consider using the numpy.savetxt() function (https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.savetxt.html) instead of csv.writer. For example:

import numpy as np

a = np.random.randint(0, 10, (10, 10), dtype=int)
a[1:5, 1:8] = 0
np.savetxt('1.txt', a, fmt='%d', delimiter=',')

File contents:

0,8,5,8,0,7,5,8,0,9
0,0,0,0,0,0,0,0,3,4
5,0,0,0,0,0,0,0,7,3
9,0,0,0,0,0,0,0,7,5
7,0,0,0,0,0,0,0,6,9
9,9,9,9,2,7,5,0,0,7
4,6,9,0,7,5,2,4,7,5
2,5,1,9,4,9,3,5,3,7
3,3,6,8,5,7,5,8,5,5
9,4,1,2,0,9,2,2,8,2

You can load the data with numpy.loadtxt() (https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html):

a = np.loadtxt('1.txt', delimiter=',', dtype=int)

Then a is:

array([[0, 8, 5, 8, 0, 7, 5, 8, 0, 9],
       [0, 0, 0, 0, 0, 0, 0, 0, 3, 4],
       [5, 0, 0, 0, 0, 0, 0, 0, 7, 3],
       [9, 0, 0, 0, 0, 0, 0, 0, 7, 5],
       [7, 0, 0, 0, 0, 0, 0, 0, 6, 9],
       [9, 9, 9, 9, 2, 7, 5, 0, 0, 7],
       [4, 6, 9, 0, 7, 5, 2, 4, 7, 5],
       [2, 5, 1, 9, 4, 9, 3, 5, 3, 7],
       [3, 3, 6, 8, 5, 7, 5, 8, 5, 5],
       [9, 4, 1, 2, 0, 9, 2, 2, 8, 2]])
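If the gene and cell labels need to stay in the file, the original csv.writer approach can probably be kept with a small change. The `...` most likely appears because `writer.writerow([gene_symbols[idx], row])` passes the whole array as a single field, so the writer stores its string form, and numpy abbreviates the string form of large arrays (more than 1000 elements with default print options). The following is a minimal sketch under that assumption: it unpacks each row into one field per value. The helper `read_matrix`, the file name `matrix_demo.csv`, and the demo data are hypothetical and only serve the illustration.

```python
import csv

import numpy as np


def write_to_disk(csv_path, mtx_norm, cell_ids, gene_symbols):
    """Write a labelled matrix with one CSV field per value."""
    # newline='' is what the csv module documentation recommends for csv.writer.
    with open(csv_path, 'w', encoding='utf8', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=',')
        # Unpack the labels so that each cell id becomes its own column.
        writer.writerow(["", *cell_ids])
        for gene, row in zip(gene_symbols, mtx_norm):
            # Unpack the row too; passing the array itself would write str(row).
            writer.writerow([gene, *np.asarray(row).tolist()])


def read_matrix(csv_path):
    """Hypothetical helper: load the numeric block, skipping the labels."""
    with open(csv_path, encoding='utf8', newline='') as csvfile:
        rows = list(csv.reader(csvfile))
    # Drop the header row and the first (gene symbol) column.
    return np.array([row[1:] for row in rows[1:]], dtype=float)


# Hypothetical demo data: 3 genes x 1500 cells, mostly zeros.
mtx = np.zeros((3, 1500))
mtx[0, 0] = 1.5
mtx[2, 1499] = 2.0
cells = [f"cell_{i}" for i in range(1500)]
genes = ["geneA", "geneB", "geneC"]

# With default print options, the string form of a long row is abbreviated.
truncated = str(mtx[0])

write_to_disk('matrix_demo.csv', mtx, cells, genes)
restored = read_matrix('matrix_demo.csv')
```

Files that were already written with the abbreviated `...` rows cannot be recovered this way, because the omitted values were never saved; they would need to be regenerated from the original matrix.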
