Question

Traceback (most recent call last):
  File "C:\Python34\Project\wordData.py", line 60, in <module>
    main()
  File "C:\Python34\Project\wordData.py", line 58, in main
    print(totalOccurences(word, readWordFile(fileName)))
  File "C:\Python34\Project\wordData.py", line 31, in readWordFile
    yc = createYearCount(int(new[1]), int(new[2]))
IndexError: list index out of range

I am trying to test my functions. With the sample file, readWordFile should return:

{'airport': [YearCount( year=2007, count=175702 ), YearCount( year=2008,
count=173294 )], 'wandered': [YearCount( year=2005, count=83769 ),
YearCount( year=2006, count=87688 ), YearCount( year=2007, count=108634 ),
YearCount( year=2008, count=171015 )], 'request': [YearCount( year=2005,
count=646179 ), YearCount( year=2006, count=677820 ), YearCount( year=2007,
count=697645 ), YearCount( year=2008, count=795265 )]}

and totalOccurences should take word (the word to search for) and words (a dictionary mapping words to lists of YearCount objects).

For example:

print(totalOccurences('wandered', readWordFile(fileName)))
451106

Full code:

class YearCount(rit_object):
    """
    Year count object taking the year and count as slots
    """
    __slots__ = ( 'year', 'count')
    _types = (int, int)

def createYearCount(year, count):
    return YearCount(year, count)

def readWordFile(fileName):
    #read in the entire unigram dataset
    """
    A dictionary mapping words to lists of YearCount objects.
    For every word, there is exactly one list of YearCount objects.
    Each YearCount object contains a year in which a
    word appeared and the count of the number of times the
    word appeared that year. 
    """
    dictionary = {}
    for line in fileName:
        new = line.split(', ') 
        id = new[0]
        yc = createYearCount(int(new[1]), int(new[2]))
        # add to list or create a new list
        if not id in dictionary:
            dictionary[id] = [yc]
        else:
            dictionary[id].append(yc)
    return dictionary

def totalOccurences(word, words):
    """
    Output: The total number of times that a word has appeared
    in a book in the entire dataset.
    return; count(total amount of times a word has appeared)
    param; word(the word for which to calculate the count)
           words(A dictionary mapping words to lists of YearCount objects)
    """
    if word not in words:
        return 0
    count = 0
    for item in words[word]:
        count += item.count
    return count

def main():
    fileName = input('Enter filename: ')
    readWordFile(open(fileName))
    word = input('Enter word to search for: ')
    print(totalOccurences(word, readWordFile(fileName)))

main()

Text file:

airport, 2007, 175702
airport, 2008, 173294
request, 2005, 646179
request, 2006, 677820
request, 2007, 697645
request, 2008, 795265
wandered, 2005, 83769
wandered, 2006, 87688
wandered, 2007, 108634
wandered, 2008, 171015

What are some simpler ways to test my program? I keep getting a list index out of range error.

Answer 1

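You get 0 because you have already read the file once after opening it. That moves the file pointer to the end of the file, so the next time you call readWordFile(fileName) it starts from the end of the file and finds nothing to read.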
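
A minimal sketch of that behaviour, using `io.StringIO` as a stand-in for the opened file (an assumption, so the example runs without a file on disk). It also shows what happens when a plain string such as the file name is iterated instead of a file object, which is what the last line of main() in the question does: every "line" is a single character, so split(', ') returns a one-element list and new[1] raises the IndexError from the traceback. The file name `words.txt` is hypothetical.

```python
import io

sample = "airport, 2007, 175702\nairport, 2008, 173294\n"
handle = io.StringIO(sample)  # stand-in for open(fileName)

first_pass = [line for line in handle]   # consumes the whole stream
second_pass = [line for line in handle]  # pointer is at the end: nothing left

# Iterating over a str yields single characters, not lines.
name = "words.txt"  # hypothetical file name
pieces = [char.split(', ') for char in name]

print(len(first_pass), len(second_pass))  # 2 0
print(pieces[0])                          # ['w'], so pieces[0][1] would fail
```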

You can store the result in a variable and reuse it later, or simply remove the first call to the function.

data = readWordFile(open(fileName))
word = input('Enter word to search for: ')
print(totalOccurences(word, data))
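
Since the question also asks for a simpler way to test, one possible approach is to feed readWordFile an in-memory file built with `io.StringIO`, so nothing has to be typed at a prompt. This is only a sketch: `YearCount` is replaced by a `namedtuple` because `rit_object` is not available outside the course library (an assumption), and the two functions are condensed versions of the question's code.

```python
import io
from collections import namedtuple

# Stand-in for the rit_object-based class (assumption for this sketch).
YearCount = namedtuple('YearCount', ['year', 'count'])

def readWordFile(lines):
    """Build a dict mapping each word to a list of YearCount objects."""
    dictionary = {}
    for line in lines:
        new = line.strip().split(', ')
        yc = YearCount(int(new[1]), int(new[2]))
        dictionary.setdefault(new[0], []).append(yc)
    return dictionary

def totalOccurences(word, words):
    """Sum the counts of `word` over all years; 0 if the word is absent."""
    return sum(item.count for item in words.get(word, []))

sample = io.StringIO(
    "wandered, 2005, 83769\n"
    "wandered, 2006, 87688\n"
    "wandered, 2007, 108634\n"
    "wandered, 2008, 171015\n"
)
data = readWordFile(sample)  # parse once, then reuse the result
print(totalOccurences('wandered', data))  # 451106
print(totalOccurences('missing', data))   # 0
```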

P.S.: You should check the length of the new variable after splitting, but before using it.
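
One way that check might look, as a sketch only: the answer does not show its own code for this, so the helper name `parseLine` and the choice to skip malformed lines rather than raise an error are assumptions.

```python
def parseLine(line):
    """Return (word, year, count), or None if the line lacks three fields.

    Hypothetical helper, not part of the original program.
    """
    new = line.strip().split(', ')
    if len(new) != 3:
        return None  # blank or malformed line: the caller can skip it
    return new[0], int(new[1]), int(new[2])

print(parseLine("airport, 2007, 175702\n"))  # ('airport', 2007, 175702)
print(parseLine("\n"))                       # None
```

Inside readWordFile, a `None` result would then be skipped with `continue` before any index into `new` is attempted.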
