python - numpy loadtxt, define the type of specific columns

Time: 2017-02-03 15:33:06

Tags: python numpy types

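I am having a problem with Python 3.x. As is well known, Python 3 reads byte strings as: b'yourString'

My problem is that I want to read a comma-separated text file in which four columns should be strings and the remaining columns int or float.

I know I can do this: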

data_files=np.genfromtxt(i, names=True, dtype=None, delimiter=",")

I would like to do something like this (I know this does not work):

data_files=np.genfromtxt(i, names=True, dtype='None,str', delimiter=",",usecols=(0,2,3,4,5,6,8,9,11,12,14,15,16,17,18,19,20,21,22)(1,7,10,13))

I tried:

alttype = np.dtype('f','s2','i2','i2','f0','f0','f0','s1','f0','f0','s1','f0','f0','s1','f0','f0','f0','f0','f0','f0','f0','f0','f0')

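But this is limited to four arguments. Since I do calculations on the numbers afterwards, I cannot read them all as str.

Any help is much appreciated.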

1 Answer:

Answer 0: (score: 0)

In [365]: alttype = np.dtype('f','s2','i2','i2','f0','f0','f0','s1','f0','f0','s1','f0','f0','s1','f0','f0','f0','f0','f0','f0','f0','f0','f0')
...
TypeError: function takes at most 4 arguments (23 given)

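The argument to dtype can take several forms: a string, a list of tuples, or one of several dictionary layouts. The error objects to the nature of the arguments, not to the number of fields you are trying to define. The dtype argument of genfromtxt is also more flexible. This format is often used: dtype="i8,f8,S5"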
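As a minimal sketch of the forms mentioned above, the same three-field dtype can be written as a comma-separated string, a list of tuples, or a dictionary. The variable names are only illustrative; the field names f0, f1, f2 are the defaults NumPy assigns to the string form.

```python
import numpy as np

# Three spellings of one structured dtype with three fields.
# The comma-separated string form names the fields f0, f1, f2 automatically.
dt_string = np.dtype("i8,f8,S5")
dt_tuples = np.dtype([("f0", "i8"), ("f1", "f8"), ("f2", "S5")])
dt_dict = np.dtype({"names": ["f0", "f1", "f2"],
                    "formats": ["i8", "f8", "S5"]})

# Each form describes the same field names and the same record size.
for dt in (dt_string, dt_tuples, dt_dict):
    print(dt.names, dt.itemsize)
```

Each form is passed to np.dtype as a single argument, which is why the 23-argument call above fails regardless of how many fields are wanted.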

So alttype = 'f,S2,i,i,f,f,f,S1,...' might work. Do not use f0.
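A small sketch of that string format applied to comma-separated input. The two sample lines and the four-column layout are made up for illustration; a real file would need one type code per column.

```python
import numpy as np

# Hypothetical sample: string, int, float, string columns.
lines = [b"a,23,2.1,c", b"b,22,2.0,d"]

# One type code per column, joined by commas, passed as a single string.
alttype = "S1,i4,f8,S1"
data = np.genfromtxt(lines, dtype=alttype, delimiter=",")

# Fields get the default names f0..f3; numeric fields stay numeric.
print(data.dtype.names)
print(data["f1"] + 1)
```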

Define a small sample 'file' (genfromtxt accepts a list of lines):

In [382]: txt=b"""A B C D E F G
     ...: a 23 2.1 c 2 3 7
     ...: b 22 2.0 d 2 5 8
     ...: """

Default load:

In [384]: np.genfromtxt(txt.splitlines(),names=True,dtype=None)
Out[384]: 
array([(b'a', 23,  2.1, b'c', 2, 3, 7), 
       (b'b', 22,  2. , b'd', 2, 5, 8)], 
      dtype=[('A', 'S1'), ('B', '<i4'), ('C', '<f8'), ('D', 'S1'), ('E', '<i4'), ('F', '<i4'), ('G', '<i4')])
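Because the result is a structured array, each column can be pulled out by its header name and the numeric ones used directly in arithmetic. A minimal sketch on the same sample; the variable name total is only illustrative.

```python
import numpy as np

txt = b"""A B C D E F G
a 23 2.1 c 2 3 7
b 22 2.0 d 2 5 8
"""

# dtype=None lets genfromtxt infer a type for each column.
data = np.genfromtxt(txt.splitlines(), names=True, dtype=None)

# Columns B (int) and C (float) are numeric, so arithmetic works as usual.
total = data["B"] + data["C"]
print(total)
```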

Load only the string columns:

In [385]: np.genfromtxt(txt.splitlines(),names=True,dtype=None, usecols=[0,3])
Out[385]: 
array([(b'a', b'c'), (b'b', b'd')], 
      dtype=[('A', 'S1'), ('D', 'S1')])
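The dtype can also be spelled out for just the selected columns, with one type code per entry in usecols. This is a sketch under the assumption that the header row is skipped rather than used for names, so the fields get the default names f0, f1, f2.

```python
import numpy as np

txt = b"""A B C D E F G
a 23 2.1 c 2 3 7
b 22 2.0 d 2 5 8
"""

# Pick columns 0, 1 and 3 and give each an explicit type:
# string, int, string. skip_header=1 drops the "A B C ..." line.
data = np.genfromtxt(txt.splitlines(), skip_header=1,
                     usecols=[0, 1, 3], dtype="S1,i4,S1")

print(data.dtype)
print(data["f1"] * 2)
```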