Question

I cannot convert strings to float using Numpy (Python 3):

It turns out that in Python 3, "strings" are Unicode by default. The b prefix on these "strings" means the interpreter treats them as bytes.
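To make the bytes/str distinction concrete, here is a minimal sketch using a single value shaped like the array entries shown in the output. The variable names are illustrative only. It suggests that the b prefix alone is not what breaks the conversion: the embedded double quotes are.

```python
raw = b'"1.0"'        # bytes, like the entries in the dtype='|S5' array
text = raw.decode()   # str: the same characters, quotes included

print(type(raw).__name__, type(text).__name__)

# float() accepts both str and bytes, but not the embedded double quotes.
try:
    float(text)
except ValueError as exc:
    print("failed:", exc)

# Once the quotes are stripped, the conversion goes through.
value = float(text.strip('"'))
print(value)
```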

Reference: Numpy: Creating a Vector through Array Comparison is NOT working

Link to the actual file: http://www.filedropper.com/recordedalcoholpercapitaconsumption19801999

I tried:

import numpy
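# Adjacent string literals are joined, so the path contains no stray whitespace.
file = (
    "/home/ds/notebooks/data/fivethirtyeight_data/who/"
    "Recorded_Alcohol_Per_Capita_Consumption_1980_1999.csv"
)
# numpy.bytes_ is the same type as the older numpy.string_ alias (removed in NumPy 2.0).
world_alcohol = numpy.genfromtxt(file, delimiter=",", dtype=numpy.bytes_, skip_header=1)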
print(world_alcohol)
print(world_alcohol.astype(float))

This gives:

array([[b'"1.0"',..., b'"2.0"'],
   [b'"3.0"',..., b'"3.0"']], 
  dtype='|S5')


ValueError: could not convert string to float: '"1.0"'

Answer 1

Your data appears to contain literal double quotes ("); you can use the converters option to strip them:

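def converter(x):
    # x may arrive as bytes or str, depending on genfromtxt's encoding setting.
    if isinstance(x, bytes):
        x = x.decode()
    return float(x.strip('"'))  # converter(b'"23"') -> 23.0

alc = numpy.genfromtxt(file, delimiter=",", skip_header=1,
                       converters={n: converter for n in range(12)})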
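As a self-contained illustration of the converter approach, the sketch below reads a small made-up sample from memory instead of the real file. The sample data, the function name strip_quotes_to_float, and the explicit encoding="utf-8" argument are assumptions for illustration, not part of the original answer.

```python
import io

import numpy


def strip_quotes_to_float(field):
    # The field may be bytes or str, depending on genfromtxt's encoding setting.
    if isinstance(field, bytes):
        field = field.decode()
    return float(field.strip('"'))


# Made-up sample standing in for the real CSV: a header row plus quoted numbers.
sample = io.StringIO(
    '"a","b","c"\n'
    '"1.0","2.5","2.0"\n'
    '"3.0","4.5","3.0"\n'
)

values = numpy.genfromtxt(
    sample,
    delimiter=",",
    skip_header=1,
    encoding="utf-8",
    # One converter per column index; here every column is numeric.
    converters={n: strip_quotes_to_float for n in range(3)},
)
print(values)
```

This only works when every listed column holds a quoted number; a text column would make the converter raise ValueError.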

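Edit

The problem is that this is not a homogeneous numeric file: some columns contain text, and so on.

You can read just the numeric columns with: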

a = numpy.genfromtxt('./desktop/alc.csv', delimiter=';', skip_header=1, usecols=(3, 7, 8, 9, 10))

But I think the best approach here is to use pandas, which is better suited to this kind of data:

import pandas
df = pandas.read_csv('./desktop/alc.csv', sep=';')

and work with it from there.
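A minimal sketch of the pandas route, assuming a semicolon-separated file with quoted fields and a mix of text and numeric columns. The sample rows and the column names Year, Country, and Litres are invented for illustration and do not come from the real file.

```python
import io

import pandas

# Made-up sample: quoted fields, one text column, two numeric columns.
sample = io.StringIO(
    '"Year";"Country";"Litres"\n'
    '"1986";"Viet Nam";"0.0"\n'
    '"1987";"Uruguay";"0.5"\n'
)

# read_csv treats the double quote as a quoting character by default,
# so the quotes are removed before column types are inferred.
df = pandas.read_csv(sample, sep=";")
print(df.dtypes)

# Pull one numeric column out as a plain float NumPy array.
litres = df["Litres"].to_numpy(dtype=float)
print(litres)
```

Because each column keeps its own type, the text column does not get in the way of the numeric ones.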
