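I'm having trouble reading a CSV file with np.genfromtxt. All records in the CSV are in scientific notation, but when I read the file with np.genfromtxt, every item in the array is nan.
Sample row from the CSV: 1.02E+02; 1.64E+00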
In [1]: read = np.genfromtxt('13G-mapa-0001.CSV', delimiter=';')
In [2]: read
Out[2]:
array([[nan, nan],
[nan, nan],
[nan, nan],
...,
[nan, nan],
[nan, nan],
[nan, nan]])
Full file:
1,204619e+002;1,639486e+000
1,214262e+002;1,623145e+000
1,223904e+002;1,607553e+000
1,233547e+002;1,592153e+000
1,243189e+002;1,576472e+000
1,252832e+002;1,560220e+000
1,262474e+002;1,543355e+000
1,272117e+002;1,526069e+000
1,281759e+002;1,508706e+000
1,291402e+002;1,491635e+000
1,301044e+002;1,475144e+000
1,310686e+002;1,459387e+000
1,320329e+002;1,444416e+000
Answer 0 (score: 1)
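Your delimiter must be a comma ',' rather than a semicolon ';'.
Edit: the problem is that the values themselves also contain commas, e.g. 1,25e+00, which need to be parsed separately.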
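To make the point of the edit concrete, here is a minimal sketch using only the standard library. It assumes the layout shown in the question: ';' separates the fields and ',' is the decimal mark. The helper name parse_line and its parameters are hypothetical, chosen for illustration.

```python
def parse_line(line, field_sep=";", decimal_sep=","):
    """Split one line into floats, treating decimal_sep as the decimal mark."""
    fields = line.strip().split(field_sep)
    return [float(field.replace(decimal_sep, ".")) for field in fields]


sample = "1,204619e+002;1,639486e+000"

# float() only accepts '.' as the decimal mark, so the raw field is rejected.
try:
    float(sample.split(";")[0])
    raw_ok = True
except ValueError:
    raw_ok = False

# After swapping the decimal comma for a dot, both fields parse as floats.
row = parse_line(sample)
print(raw_ok, row)
```

The same idea (replace the decimal comma before converting) applies whichever reader is used afterwards.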
This is my workaround.
Answer 1 (score: 1)
Based on this answer, you can convert the decimal commas as follows:
import numpy as np

def conv(x):
    # Swap the decimal comma for a dot; encode to bytes, which genfromtxt accepts
    return x.replace(',', '.').encode()

read = np.genfromtxt((conv(x) for x in open("x.csv")), delimiter=';')
>>> read
array([[120.4619 , 1.639486],
[121.4262 , 1.623145],
[122.3904 , 1.607553],
[123.3547 , 1.592153],
[124.3189 , 1.576472],
[125.2832 , 1.56022 ],
[126.2474 , 1.543355],
[127.2117 , 1.526069],
[128.1759 , 1.508706],
[129.1402 , 1.491635],
[130.1044 , 1.475144],
[131.0686 , 1.459387],
[132.0329 , 1.444416]])
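A self-contained variant of the same idea, as a sketch: it assumes the data fits in memory and uses the first three rows from the question as an in-memory sample instead of a file on disk. Instead of a per-line generator it replaces the commas in the whole text at once and wraps the result in io.StringIO. The variable names raw, bad and good are illustrative.

```python
import io

import numpy as np

# First three rows of the file from the question: ';' separates fields,
# ',' is the decimal mark.
raw = """1,204619e+002;1,639486e+000
1,214262e+002;1,623145e+000
1,223904e+002;1,607553e+000
"""

# Read as-is: fields such as '1,204619e+002' are not valid floats,
# so genfromtxt fills the array with nan.
bad = np.genfromtxt(io.StringIO(raw), delimiter=';')

# Replace the decimal commas with dots first, then parse.
good = np.genfromtxt(io.StringIO(raw.replace(',', '.')), delimiter=';')

print(bad)
print(good)
```

A blanket replace is only safe here because ',' never appears as a field separator or inside text fields; for mixed files a per-column conversion would be the more careful choice.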
Answer 2 (score: 1)
pandas offers a modern, fast and versatile approach:
import pandas as pd
table = pd.read_csv('data.csv', sep=';', decimal=',', header=None)
arr = table.values
This gives:
array([[ 120.4619 , 1.639486],
[ 121.4262 , 1.623145],
[ 122.3904 , 1.607553],
[ 123.3547 , 1.592153],
[ 124.3189 , 1.576472],
[ 125.2832 , 1.56022 ],
[ 126.2474 , 1.543355],
[ 127.2117 , 1.526069],
[ 128.1759 , 1.508706],
[ 129.1402 , 1.491635],
[ 130.1044 , 1.475144],
[ 131.0686 , 1.459387],
[ 132.0329 , 1.444416]])
read_csv offers more advanced options than genfromtxt.