I used the following test code:
import numpy as np
import csv
data = np.zeros((3,),dtype=("S24,int,float"))
with open("testtest.csv", 'w', newline='') as f:
    writer = csv.writer(f,delimiter=',')
    for row in data:
        writer.writerow(row)
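The data in the CSV file carries the b'' marker (the bytes literal prefix) for the string field of the record array. What is the correct way to write such record arrays to CSV, and what is the best way to keep the bytes literal marker out of my CSV file?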
Answer 0 (score: 1)
I assume you are using Python 3, which uses unicode as the default string type. Byte strings are then displayed with the special b marker.
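To make the source of the marker concrete, here is a minimal sketch. It assumes the same record layout as the question and uses an in-memory `io.StringIO` buffer in place of a real file; the `'xxx'` value is only illustrative. The `csv` writer converts non-string fields with `str()`, and `str()` of a bytes value keeps the `b'...'` wrapper.

```python
import csv
import io

import numpy as np

# str() on bytes keeps the b'...' wrapper; str() on a unicode string does not.
bytes_as_text = str(b"xxx")
unicode_as_text = str("xxx")
print(bytes_as_text, unicode_as_text)

# Same record layout as in the question: bytes, int, float.
data = np.zeros((3,), dtype="S24,int,float")
data["f0"] = b"xxx"  # illustrative value

buffer = io.StringIO()  # in-memory stand-in for the CSV file
writer = csv.writer(buffer, delimiter=",")
writer.writerow(data[0])  # non-str fields are converted with str()

first_line = buffer.getvalue().strip()
print(first_line)  # the first field shows up with the b'' marker
```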
If I generate the data with unicode instead of bytes, it works:
In [654]: data1 = np.zeros((3,),dtype=("U24,int,float"))
In [655]: data1['f0']='xxx' # more interesting string field
In [656]: with open('test.csv','w') as f:
              writer=csv.writer(f,delimiter=',')
              for row in data1:
                  writer.writerow(row)
In [658]: cat test.csv
xxx,0,0.0
xxx,0,0.0
xxx,0,0.0
np.savetxt does the same thing:
In [668]: np.savetxt('test.csv',data1,fmt='%s',delimiter=',')
In [669]: cat test.csv
xxx,0,0.0
xxx,0,0.0
xxx,0,0.0
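If more control over the numeric columns is wanted, np.savetxt also accepts a sequence of format specifiers, one per field. The sketch below assumes the same unicode array as above; the file name and the particular formats are only illustrative, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import numpy as np

data1 = np.zeros((3,), dtype="U24,int,float")
data1["f0"] = "xxx"

# One format specifier per field: string, integer, float with 3 decimals.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test_fmt.csv")  # illustrative file name
    np.savetxt(path, data1, fmt=["%s", "%d", "%.3f"], delimiter=",")
    with open(path) as f:
        lines = f.read().splitlines()

print(lines)
```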
The question is, can I get around this while keeping the S24 field? For example, by opening the file as wb?
I explored this question earlier at https://stackoverflow.com/a/27513196/901925 (Trying to strip b' ' from my Numpy array).
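It looks like my solutions there were to decode the byte fields, or to write directly to a byte file. Since your array mixes string and numeric fields, the decode solution is more tedious.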
data1 = data.astype('U24,i,f') # convert bytestring field to unicode
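As a rough end-to-end sketch of this conversion, the snippet below converts the bytes field and writes the result with the csv module. The sample values and the in-memory buffer are assumptions for illustration. It spells the target dtype as `'U24,int,float'` so that the numeric fields keep their original types; the short codes `'i'` and `'f'` select 32-bit types, which may not be what you want.

```python
import csv
import io

import numpy as np

data = np.zeros((3,), dtype="S24,int,float")
data["f0"] = [b"xxx", b"yyy", b"zzz"]  # illustrative values

# Convert only the string field to unicode; numeric fields keep their types.
data1 = data.astype("U24,int,float")

buffer = io.StringIO()  # in-memory stand-in for the CSV file
writer = csv.writer(buffer, delimiter=",")
for row in data1:
    writer.writerow(row)

rows = buffer.getvalue().splitlines()
print(rows)  # the first column no longer carries the b'' marker
```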
A helper function can be used to decode the byte strings:
In [147]: fn = lambda row: [j.decode() if isinstance(j,bytes) else j for j in row]
In [148]: with open('test.csv','w') as f:
              writer=csv.writer(f,delimiter=',')
              for row in data:
                  writer.writerow(fn(row))
In [149]: cat test.csv
xxx,0,0.0
yyy,0,0.0
zzz,0,0.0
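The other route mentioned above, writing directly to a byte file, might look like the following sketch. The helper `row_to_bytes` is a hypothetical name introduced here, not part of csv or NumPy, and it does no CSV quoting or escaping, so it only suits fields that contain no delimiters or newlines.

```python
import os
import tempfile

import numpy as np

data = np.zeros((3,), dtype="S24,int,float")
data["f0"] = [b"xxx", b"yyy", b"zzz"]  # illustrative values


def row_to_bytes(row, delimiter=b","):
    """Join one record into a bytes line (hypothetical helper, no quoting)."""
    # Bytes fields are written as-is; other fields go through str() and encode.
    parts = [v if isinstance(v, bytes) else str(v).encode("ascii") for v in row]
    return delimiter.join(parts) + b"\n"


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test_bytes.csv")  # illustrative file name
    with open(path, "wb") as f:  # binary mode, so only bytes are written
        for row in data:
            f.write(row_to_bytes(row))
    with open(path, "rb") as f:
        raw = f.read()

print(raw)
```

Because the bytes are written unchanged, the S24 field never passes through `str()`, so no `b''` marker reaches the file.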
Answer 1 (score: 0)
Do you need all three dtypes in the data? Consider using numpy.savetxt() on a NumPy array of floats or integers.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html
data = np.zeros((3,3))
filename='foo'
np.savetxt(filename+".csv",data,fmt='%1.6e',delimiter=",")
#fmt='%1.6e' controls how the numbers are written to the text file.
#E.g. use fmt='%d' for integers