Question

I'm trying to fetch some data from an online CSV file and build a table from it. I use splitlines() to separate the lines, but I keep getting a ValueError:

ValueError: invalid literal for int() with base 10: 'Year'

Here is my code:

import csv
import urllib.request

url = "https://raw.github.com/datasets/gdp/master/data/gdp.csv"
webpage = urllib.request.urlopen(url)
datareader = csv.reader(webpage.read().decode('utf-8').splitlines())
dataList = []
NewTable = []
print('done')
for row in datareader:
    ##print(row)
    countryName, countryCode, Year, Value= row
    print(Year)
    Year = int(Year)
    ##Value = float(Value)
    rowTuple = countryName, countryCode, Year, Value
    dataList.append(rowTuple)

When I uncomment print(Year), I get a list of integers, all the numbers from 1960 to 2012, and I can't figure out why it won't accept the conversion from string to integer.

Any ideas?

Answer 1

The first row in the CSV is a header row, not a data row:

Country Name,Country Code,Year,Value

Skip it:

datareader = csv.reader(webpage.read().decode('utf-8').splitlines())
next(datareader, None)  # skip the header
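
As a minimal sketch of this fix, the snippet below runs the same kind of loop on a small in-memory sample instead of the real download, so it works without network access. The data rows and the parse_rows helper are made up for illustration; only the header line is taken from the file.

```python
import csv

# Hypothetical sample standing in for the downloaded text; only the header
# line comes from the real file, the data rows are invented.
SAMPLE = """Country Name,Country Code,Year,Value
Exampleland,EXL,1960,100.5
Exampleland,EXL,1961,110.25
"""

def parse_rows(text):
    """Parse CSV text into (name, code, year, value) tuples, skipping the header."""
    reader = csv.reader(text.splitlines())
    next(reader, None)  # skip the header row so int('Year') is never attempted
    rows = []
    for country_name, country_code, year, value in reader:
        rows.append((country_name, country_code, int(year), float(value)))
    return rows

data_list = parse_rows(SAMPLE)
print(data_list)
```

Passing None as the second argument to next() means an empty input does not raise StopIteration; the loop simply has nothing to iterate over.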

You can also use an io.TextIOWrapper() object to decode the response from UTF-8 for you:

import io

webpage = urllib.request.urlopen(url)
datareader = csv.reader(io.TextIOWrapper(webpage, 'utf-8'))
next(datareader, None)  # skip the header
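
The same idea can be tried offline. In this rough sketch, an io.BytesIO object stands in for the HTTP response (an assumption made so the example needs no network access), and the read_years helper and sample bytes are hypothetical.

```python
import csv
import io

# Hypothetical bytes standing in for the HTTP response body.
RAW = b"Country Name,Country Code,Year,Value\nExampleland,EXL,1960,100.5\n"

def read_years(binary_stream):
    """Decode a binary stream as UTF-8 on the fly and return the Year column as ints."""
    # newline="" lets the csv module handle line endings itself
    text_stream = io.TextIOWrapper(binary_stream, encoding="utf-8", newline="")
    reader = csv.reader(text_stream)
    next(reader, None)  # skip the header
    return [int(row[2]) for row in reader]

years = read_years(io.BytesIO(RAW))
print(years)
```

With this approach the text is decoded as the reader consumes it, so the whole body does not have to be read and split into lines first.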

