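The code below is what I wrote to read data from a website and store it in a list. The code works, but it also throws a list index out of range error. Can anyone explain what I'm doing wrong?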
import urllib.request

data_url = "http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
aboveFifty = 0
belowFifty = 0
""" The variables for storage """
age = 0
worksFor = ""
college = ""
salary = ""
bools = True
try:
    print("Retrieving the data... ")
    local_file, headers = urllib.request.urlretrieve(data_url)
    print("Data retrieved")
    fh = open(local_file, "r")
    print("Reading the file... ")
    for row in fh:
        table = [row.strip().split(" ")]
        salary = table[0][14]
        if bools == True:
            print("Table: ", table)
            bools = False
        if salary == "<=50K":
            belowFifty += 1
        elif salary == ">50K":
            aboveFifty += 1
except IOError as e:
    print("IO Error: ", e)
except IndexError as ie:
    print("Index error: ", ie)
print("Above fifty: ", aboveFifty, "Below fifty: ", belowFifty)
fh.close()
The traceback I get is:
Traceback (most recent call last):
  File "C:\Users\Killian\workspace\College\Assignment.py", line 25, in <module>
    salary = table[0][14]
IndexError: string index out of range
Answer 0 (score: 1)
Your data is corrupted. Specifically, there is a blank line at the end of the data file. You can handle the corrupted data like this:
for row in fh:
    row = row.strip()
    # "".split(" ") returns [""], so test the stripped row itself
    if not row:
        continue  # <-- ignore blank lines
    table = [row.split(" ")]
    salary = table[0][14]
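
As a variation on the same idea, here is a minimal sketch that parses each record with the standard csv module instead of splitting on spaces, and skips any row too short to contain a salary field. It assumes the records are separated by a comma followed by a space and that the salary label is field 14, as in the question. The names count_salaries and salary_column are made up for illustration, and the sample rows are only stand-ins shaped like the file's records, so the network download is left out.

```python
import csv


def count_salaries(lines, salary_column=14):
    """Count rows by salary label, skipping blank or short rows.

    `lines` is any iterable of text lines, e.g. an open file object.
    `salary_column` is an assumed position of the salary field.
    """
    above = below = 0
    # skipinitialspace=True drops the space that follows each comma.
    for fields in csv.reader(lines, skipinitialspace=True):
        if len(fields) <= salary_column:
            continue  # blank line or truncated record
        salary = fields[salary_column]
        if salary == "<=50K":
            below += 1
        elif salary == ">50K":
            above += 1
    return above, below


# Illustrative rows in the same shape as the data file, plus a blank line.
sample = [
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n",
    "52, Self-emp-not-inc, 209642, HS-grad, 9, Married-civ-spouse, "
    "Exec-managerial, Husband, White, Male, 0, 0, 45, United-States, >50K\n",
    "\n",
]
above, below = count_salaries(sample)
print("Above fifty:", above, "Below fifty:", below)
```

With the real file you could pass the open file handle directly, for example count_salaries(fh). Checking the field count rather than only testing for an empty line also guards against a partially written record.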