Question

I have some sample text that I am processing with Python.

  Afghanistan:32738376
  Akrotiri:15700
  Albania:3619778
  Algeria:33769669
  American Samoa:57496
  Andorra:72413
  Angola:12531357
  Anguilla:14108
  Antigua and Barbuda:69842
  Argentina:40677348
  Armenia:2968586
  Aruba:101541
  Australia:20600856
  Austria:8205533
  Azerbaijan:8177717

I have this code, which builds a dictionary from the country names and populations.

  dct = {}
  for line in infile:
      line = line.strip()
      words = line.split(":")
      countryname = words[0]
      population = int(words[1])
      dct[countryname] = population

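When I print population, it prints all the values, but then I get: population = int(words[1]) - IndexError: list index out of range. I don't understand how I am getting this error, especially since printing countryname works fine; the error only occurs for population. Python has to read the same number of lines for both variables, yet population seems to be trying to access more lines than countryname does. Any ideas on why this is happening?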

Answer 1

You are assuming that your file is perfectly formatted, and that assumption is wrong.

try:
    countryname = words[0]
    population = int(words[1])
    dct[countryname] = population
except IndexError:
    print("Could not parse line: %s" % line)

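In a case like this I would prefer logging over a print statement, but for the sake of the example I think print is fine. If you need it, you should also print the line number.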

In any case, the purpose of the try/except is to keep the program from crashing when the file does not match the format you expect.
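
A minimal sketch of the logging variant mentioned above, with the line number included. The function name load_populations and the io.StringIO sample standing in for the real file are assumptions made for this example. It also catches ValueError, since int() raises that when the text after the colon is not a number.

```python
import io
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger(__name__)


def load_populations(infile):
    """Build {country: population}, logging lines that cannot be parsed."""
    dct = {}
    for line_number, line in enumerate(infile, start=1):
        line = line.strip()
        words = line.split(":")
        try:
            dct[words[0]] = int(words[1])
        except (IndexError, ValueError):
            # IndexError: no ":" on the line
            # ValueError: the population field is not an integer
            logger.warning("Could not parse line %d: %r", line_number, line)
    return dct


# Made-up sample data standing in for the real file
sample = io.StringIO("Albania:3619778\nAndorra72413\n\nAruba:101541\n")
populations = load_populations(sample)
print(populations)  # {'Albania': 3619778, 'Aruba': 101541}
```

With this sample, lines 2 and 3 are reported as warnings and the remaining lines are loaded normally.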

Answer 2

There may be lines that do not contain the ":" separator. Try to handle that case:

dct = {}
for line in infile:
    line = line.strip()
    words = line.split(":")
    countryname = words[0]

    # Default to 0 when the line has no ":" separator
    population = 0
    if len(words) > 1:
        population = int(words[1])

    dct[countryname] = population
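
This also explains why countryname works while population fails. str.split always returns at least one element, so words[0] exists even for a line with no ":" (including an empty line), whereas words[1] does not. A quick check, using made-up sample lines:

```python
# Sample lines are made up for illustration
samples = ["Aruba:101541", "Aruba101541", ""]
results = [line.strip().split(":") for line in samples]

for line, words in zip(samples, results):
    print(repr(line), "->", words, "| len =", len(words))

# 'Aruba:101541' -> ['Aruba', '101541'] | len = 2
# 'Aruba101541' -> ['Aruba101541'] | len = 1
# '' -> [''] | len = 1
```

A blank line at the end of the file would be consistent with the error appearing only after every value has been printed, though that is a guess without seeing the actual file.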

Answer 3

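Please check the contents of your file. It looks like the ":" between the country name and the population is missing somewhere in the file: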

# dict() raises ValueError if any line does not split into exactly two parts
with open('a.txt') as rfile:
    print(dict(line.strip().split(':') for line in rfile))

Answer 4

I suggest adding the following diagnostics to your code:

dct = {}
for line_number, line in enumerate(infile, start=1):
    line = line.strip()
    words = line.split(":")

    if len(words) != 2:
        print("Line {} is not correctly formatted - {}".format(line_number, line))
    else:
        countryname = words[0]
        population = int(words[1])
        dct[countryname] = population

This will show which line numbers in your data have formatting problems, producing output like the following:

Line 123 is not correctly formatted - Germany8205534
Line 1234 is not correctly formatted - Hungary8205535
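
If blank lines turn out to be the cause and should simply be ignored, a variant along these lines may help. It is only a sketch: the function name parse_populations, the returned bad_lines list, and the io.StringIO sample are assumptions for illustration. It uses str.partition instead of split and skips empty lines without reporting them.

```python
import io


def parse_populations(infile):
    """Return (dct, bad_lines); blank lines are skipped without being reported."""
    dct = {}
    bad_lines = []
    for line_number, line in enumerate(infile, start=1):
        line = line.strip()
        if not line:
            continue  # ignore empty lines, e.g. a blank line at the end of the file
        # partition splits on the first ":" only; sep is "" if no ":" is found
        countryname, sep, number = line.partition(":")
        if sep and number.strip().isdecimal():
            dct[countryname] = int(number)
        else:
            bad_lines.append((line_number, line))
    return dct, bad_lines


# Made-up sample input standing in for the real file
sample = io.StringIO("Austria:8205533\nGermany8205534\n\nAruba:101541\n")
dct, bad_lines = parse_populations(sample)
print(dct)        # {'Austria': 8205533, 'Aruba': 101541}
print(bad_lines)  # [(2, 'Germany8205534')]
```

Returning the malformed lines instead of printing them leaves it up to the caller whether to log them, fix the file, or stop.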

Python - index out of range error
