I have some code that works fine in Python 2.7, using numpy's loadtxt function to read a CSV file into a numpy array. The file can be seen here. I use this command:
inp = numpy.loadtxt(filename, dtype=str, delimiter=',', skiprows=1)
With this, I get the following in Python 2.7:
array([['BKNIF', '01-Jan-2014', '11418.9', '11432.55', '11361', '11385.6',
'0'],
['BSESN', '01-Jan-2014', '21222.19', '21244.35', '21133.82',
'21140.48', '0'],
['DXY', '01-Jan-2014', '80.21', '80.24', '80.16', '80.19', '0'],
['FBV', '01-Jan-2014', '0', '0', '0', '0', '0']],
dtype='|S11')
However, with Python 3.3, I get:
array([["b'BKNIF'", "b'01-Jan-2014'", "b'11418.9'", "b'11432.55'",
"b'11361'", "b'11385.6'", "b'0'"],
["b'BSESN'", "b'01-Jan-2014'", "b'21222.19'", "b'21244.35'",
"b'21133.82'", "b'21140.48'", "b'0'"],
["b'DXY'", "b'01-Jan-2014'", "b'80.21'", "b'80.24'", "b'80.16'",
"b'80.19'", "b'0'"],
["b'FBV'", "b'01-Jan-2014'", "b'0'", "b'0'", "b'0'", "b'0'", "b'0'"]],
dtype='<U14')
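Note how the import wraps each item in double quotes and puts a b in front of it. It has also evidently decided to encode the strings differently. Even if I use dtype='|S11' instead of dtype=str, I get the same behaviour.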
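Please don't comment on why I am using numpy loadtxt, or on whether you think my use of loadtxt is inefficient. Right now I need help working out why the behaviour changed and how to fix it. Thanks.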
Answer 0 (score: 1)
In [20]: m=loadtxt(fname, dtype='S20', delimiter=',', skiprows=1)
In [21]: m.astype(str)
Out[21]:
array([['BKNIF', '01-Jan-2014', '11418.9', '11432.55', '11361', '11385.6',
'0'],
['BSESN', '01-Jan-2014', '21222.19', '21244.35', '21133.82',
'21140.48', '0'],
['DXY', '01-Jan-2014', '80.21', '80.24', '80.16', '80.19', '0'],
['FBV', '01-Jan-2014', '0', '0', '0', '0', '0'],
['NSEI', '01-Jan-2014', '6323.8', '6327.2', '6298.25', '6301.65',
'0'],
['NVOT', '01-Jan-2014', '30783.764', '2313498.5', '30783.764',
'2313498.5', '0'],
['RUI', '01-Jan-2014', '1027.14', '1030.97', '1027.14', '1030.364',
'0'],
['RUT', '01-Jan-2014', '1160.64', '1165.64', '1160.64', '1163.637',
'0'],
['SENSEX', '01-Jan-2014', '21222.19', '21244.35', '21133.82',
'21140.48', '0']],
dtype='<U20')
But the elements of m itself are still numpy.bytes_ (astype returns a new array and leaves m unchanged):
In [22]: m[0][0]
Out[22]: b'BKNIF'
In [23]: type(m[0][0])
Out[23]: numpy.bytes_
It doesn't look very pretty, I suppose?
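As for why the b'...' shows up: a likely explanation, assuming the NumPy release in question reads the file as bytes and then calls str() on each field, is that str() on a bytes object in Python 3 returns its repr rather than decoding it. The sketch below reproduces that symptom and then shows the load-as-bytes-then-decode workaround on a tiny in-memory file. The header row and the names csv_text, raw and decoded are made up for illustration; only the data rows come from the question. Newer NumPy releases (1.14 onward, as far as I know) also accept an encoding argument to loadtxt, which may make the decode step unnecessary.

```python
import io

import numpy as np

# Hypothetical stand-in for the CSV file; the header names are invented.
csv_text = (
    "Symbol,Date,Open,High,Low,Close,Volume\n"
    "BKNIF,01-Jan-2014,11418.9,11432.55,11361,11385.6,0\n"
    "DXY,01-Jan-2014,80.21,80.24,80.16,80.19,0\n"
)

# In Python 3, str() on bytes gives the repr, including the b prefix and quotes.
symptom = str(b"BKNIF")

# Load every field as fixed-width bytes, skipping the header row.
raw = np.loadtxt(io.StringIO(csv_text), dtype="S20", delimiter=",", skiprows=1)

# Decode the whole array to unicode strings in one step.
decoded = np.char.decode(raw, "ascii")

print(repr(symptom))
print(type(raw[0][0]), type(decoded[0][0]))
print(decoded)
```

np.char.decode makes the encoding explicit, whereas m.astype(str) relies on NumPy's default bytes-to-unicode conversion; for plain ASCII data like this the two should give the same result.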