I have a text file like this:
"-3.588920831680E-02","1.601887196302E-01","1.302309112549E+02"
"3.739478886127E-01","1.782759875059E-01","6.490543365479E+01"
"3.298096954823E-01","6.939357519150E-02","2.112392578125E+02"
"-2.319437451661E-02","1.149862855673E-01","2.712340698242E+02"
"-1.015115305781E-01","-1.082316488028E-01","6.532022094727E+01"
"-5.374089814723E-03","1.031072884798E-01","5.510117187500E+02"
"6.748274713755E-02","1.679160743952E-01","4.033969116211E+02"
"1.027429699898E-01","1.379162818193E-02","2.374352874756E+02"
"-1.371455192566E-01","1.483036130667E-01","2.703260498047E+02"
"NULL","NULL","NULL"
"3.968210220337E-01","1.893606968224E-02","2.803018188477E+01"
I tried to read this text file with numpy:
import numpy as np
dat = np.genfromtxt('data.txt',delimiter=',',dtype='str')
print("dat = {}".format(dat))
# now when I try to convert to float
dat = dat.astype(np.float) # it fails
# try to make it float
dat = np.char.strip(dat, '"').astype(float)
  File "test.py", line 25, in <module>
    dat = dat.astype(np.float) # it fails
ValueError: could not convert string to float: '"-3.588920831680E-02"'
How can I fix this error?
Related link:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html#numpy.genfromtxt
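Answer 0: (score: 2)
You can read the file directly with the csv module, which strips the surrounding double quotes from each field, and map NULL to NaN yourself: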
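import csv
import numpy as np

# csv.reader removes the double quotes around each field
with open('file1', newline='') as f:
    reader = csv.reader(f, delimiter=",")
    # convert each field to float, mapping the literal string 'NULL' to NaN
    data = np.array([[float(i) if i != 'NULL' else np.nan for i in row]
                     for row in reader])
print(data)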
[[ -3.58892083e-02 1.60188720e-01 1.30230911e+02]
[ 3.73947889e-01 1.78275988e-01 6.49054337e+01]
[ 3.29809695e-01 6.93935752e-02 2.11239258e+02]
[ -2.31943745e-02 1.14986286e-01 2.71234070e+02]
[ -1.01511531e-01 -1.08231649e-01 6.53202209e+01]
[ -5.37408981e-03 1.03107288e-01 5.51011719e+02]
[ 6.74827471e-02 1.67916074e-01 4.03396912e+02]
[ 1.02742970e-01 1.37916282e-02 2.37435287e+02]
[ -1.37145519e-01 1.48303613e-01 2.70326050e+02]
[ nan nan nan]
[ 3.96821022e-01 1.89360697e-02 2.80301819e+01]]
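Once the NULL entries have become NaN, you will often want to drop or mask those rows before doing any arithmetic. The following is a minimal sketch of that follow-up step, not part of the original answer. It assumes a short in-memory sample (the `sample` variable is hypothetical and stands in for the file on disk) and that a row should be discarded if any of its fields is NULL.

```python
import csv
import io

import numpy as np

# Hypothetical in-memory sample standing in for the file on disk
sample = io.StringIO(
    '"-3.588920831680E-02","1.601887196302E-01","1.302309112549E+02"\n'
    '"NULL","NULL","NULL"\n'
    '"3.968210220337E-01","1.893606968224E-02","2.803018188477E+01"\n'
)

# Same conversion as above: quotes are stripped by csv, NULL becomes NaN
rows = [[np.nan if field == 'NULL' else float(field) for field in row]
        for row in csv.reader(sample)]
data = np.array(rows)

# Keep only the rows in which no value is NaN
complete = data[~np.isnan(data).any(axis=1)]
print(complete.shape)
```

If you would rather keep the rows and ignore the gaps, functions such as np.nanmean skip NaN values instead.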
Answer 1: (score: -1)
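The problem is that your floats are wrapped in two sets of quotes rather than one. NumPy expects your array to hold strings like
'1.45E-02'
Instead, you have something like
' "1.45E-02" '
(note the extra double quotes at the beginning and end).
So the fix is simply to remove those extra double quotes, which can easily be done as follows:
dat_new = np.char.replace(dat,'"','')
dat_new = np.char.replace(dat_new,'NULL','0') #You also need to do something
#with NULL. Here I am just replacing it with 0.
dat_new = dat_new.astype(float)
np.char.replace(np_array,string_to_replace,replacement) basically works as 'find and replace': it replaces every instance of the second argument with the third argument.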
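Replacing NULL with 0 makes missing values indistinguishable from real zeros. Here is a hedged variant of the same find-and-replace idea that substitutes the string 'nan' instead, which NumPy parses as a floating-point NaN. The small `dat` array below is a hypothetical stand-in for what genfromtxt returns with dtype='str'.

```python
import numpy as np

# Hypothetical stand-in for the string array returned by
# np.genfromtxt('data.txt', delimiter=',', dtype='str')
dat = np.array([['"-3.588920831680E-02"', '"1.601887196302E-01"'],
                ['"NULL"', '"NULL"']])

# Remove every double quote, then turn the NULL marker into 'nan'
cleaned = np.char.replace(dat, '"', '')
cleaned = np.char.replace(cleaned, 'NULL', 'nan')

# The strings are now plain numeric literals (or 'nan'), so this succeeds
values = cleaned.astype(float)
print(values)
```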