Numpy recarray writes byte-literal markers into my csv file?

Asked: 2015-09-18 20:50:28

Tags: python csv numpy literals

I am using the following test code:

import numpy as np
import csv

data = np.zeros((3,),dtype=("S24,int,float"))
with open("testtest.csv", 'w', newline='') as f:
    writer = csv.writer(f,delimiter=',')
    for row in data:
        writer.writerow(row)

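In the resulting csv file, the string component of each record carries the b'' marker (the byte-literal marker). What is the correct way to write such record arrays to csv, and what is the best way to keep the byte-literal markers out of my csv file?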

2 Answers:

Answer 0: (score: 1)

I assume you are using Python 3, where unicode is the default string type. Byte strings are then displayed with the special b marker.
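To make the cause concrete, here is a minimal sketch using only the standard library. It assumes a plain tuple standing in for one record (the names `row`, `raw_line`, and `decoded_line` are illustrative, not from the question). `csv.writer` converts non-string fields with `str()`, and `str()` of a bytes object includes the `b'...'` wrapper:

```python
import csv
import io

# One record, shaped like a row of the "S24,int,float" array (illustrative data).
row = (b"xxx", 0, 0.0)

# Writing the bytes value as-is: csv.writer falls back to str(), which keeps b'...'.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(row)
raw_line = buf.getvalue()

# Decoding bytes fields first gives a clean text field.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(
    [v.decode("ascii") if isinstance(v, bytes) else v for v in row]
)
decoded_line = buf.getvalue()

print(repr(raw_line))      # "b'xxx',0,0.0\n"
print(repr(decoded_line))  # 'xxx,0,0.0\n'
```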

If I build the data with a unicode field instead of a bytes field, it works:

In [654]: data1 = np.zeros((3,),dtype=("U24,int,float"))
In [655]: data1['f0']='xxx'  # more interesting string field
In [656]: with open('test.csv','w') as f:
    writer=csv.writer(f,delimiter=',')
    for row in data1:
        writer.writerow(row)
In [658]: cat test.csv
xxx,0,0.0
xxx,0,0.0
xxx,0,0.0

np.savetxt does the same thing:

In [668]: np.savetxt('test.csv',data1,fmt='%s',delimiter=',')
In [669]: cat test.csv
xxx,0,0.0
xxx,0,0.0
xxx,0,0.0

The question is, can I work around this while keeping the S24 field? For example, by opening the file as wb?

I explored this issue earlier in https://stackoverflow.com/a/27513196/901925 (Trying to strip b' ' from my Numpy array).

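It looks like my solutions there were either to decode the bytes field or to write directly to a bytes file. Because your array mixes string and numeric fields, the decode solution is more tedious.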

data1 = data.astype('U24,i,f') # convert bytestring field to unicode
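As a fuller sketch of the same conversion: the type codes `i` and `f` denote a 32-bit integer and a single-precision float, which on most platforms is narrower than the original `int` and `float` fields. The version below spells out the field dtypes so that only the string field changes. The explicit `i8`/`f8` widths and the sample values are assumptions for illustration, not taken from the question:

```python
import numpy as np

# Structured array with a bytes field, mirroring "S24,int,float"
# (explicit 64-bit widths are an assumption).
byte_dtype = np.dtype([("f0", "S24"), ("f1", "i8"), ("f2", "f8")])
data = np.zeros((3,), dtype=byte_dtype)
data["f0"] = [b"xxx", b"yyy", b"zzz"]  # illustrative values

# Same layout, but with a unicode string field; the numeric fields are unchanged.
text_dtype = np.dtype([("f0", "U24"), ("f1", "i8"), ("f2", "f8")])
data1 = data.astype(text_dtype)

# tolist() yields plain Python tuples (str, int, float), ready for csv.writer.
rows = data1.tolist()
print(rows)
```

Each tuple in `rows` can then be passed to `writer.writerow` without any per-row decoding.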

Alternatively, a helper function can decode the byte strings row by row:

In [147]: fn = lambda row: [j.decode() if isinstance(j,bytes) else j for j in row]
In [148]: with open('test.csv','w') as f:
    writer=csv.writer(f,delimiter=',')
    for row in data:
        writer.writerow(fn(row))
   .....:         
In [149]: cat test.csv
xxx,0,0.0
yyy,0,0.0
zzz,0,0.0

Answer 1: (score: 0)

Do you need the data in all three of these dtypes? Consider using numpy.savetxt() on a numpy array of floats or integers instead.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html

data = np.zeros((3,3))
filename='foo'
np.savetxt(filename+".csv",data,fmt='%1.6e',delimiter=",")
#fmt='%1.6e' controls how the numbers are written to the text file. 
#E.g. use fmt='%d' for integers