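My goal is to build a list of lists in which each item of the outer list holds a word at its first index and the number of times that word has been encountered at its second index. For example, it should look like this: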
[["test1",0],["test2",4],["test3",8]]
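The only problem is that when I try to access the word "test1" from the first inner list, I get an index out of range error. Here is the code where I attempt this: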
stemmedList = [[]]
f = open(a_document_name, 'r')
# read each line of file
fileLines = f.readlines()
for fileLine in fileLines:
    # here we end up with stopList, a list of words
    thisReview = Hw1.read_line(fileLine)['text']
    tokenList = Hw1.tokenize(thisReview)
    stopList = Hw1.stopword(tokenList)
    # for each word in stoplist, compare to all terms in return list to
    # see if it exists, if it does add one to its second parameter, else
    # add it to the list as ["word", 0]
    for word in stopList:
        # if list not empty
        if not len(unStemmedList) == 1:  # for some reason I have to do this to see if list is empty, I'm assuming when it's empty it returns a length of 1 since I'm initializing it as a list of lists??
            print "List not empty."
            for innerList in unStemmedList:
                if innerList[0] == word:
                    print "Adding 1 to [" + word + ", " + str(innerList[1]) + "]"
                    innerList[1] = (innerList[1] + 1)
                else:
                    print "Adding [" + word + ", 0]"
                    unStemmedList.append([word, 0])
        else:
            print "List empty."
            unStemmedList.append([word, 0])
            print unStemmedList[len(unStemmedList)-1]
return stemmedList
The output ends up being:
List empty.
['test1', 0]
List not empty.
It then crashes with a list index out of range error pointing at the line if innerList[0] == word
Answer 0 (score: 0)
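Assuming stemmedList and unStemmedList are the same list:
stemmedList = [[]]
Your list of lists already contains one empty list, and that empty list has no element at [0]. Initialize it like this instead: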
stemmedList = []
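To make the difference concrete, here is a small sketch of how the two initializations behave. The names `stemmed` and `fixed` are illustrative only and do not come from the question, and the snippet uses Python 3 `print()` rather than the Python 2 statements in the question.

```python
# Demonstration only: `stemmed` and `fixed` are illustrative names.
stemmed = [[]]           # outer list that already holds one empty inner list
print(len(stemmed))      # 1, so a check for "length 0" never sees it as empty
print(len(stemmed[0]))   # 0, the inner list itself has no elements

try:
    stemmed[0][0]        # same access pattern as innerList[0] in the question
except IndexError as exc:
    print("IndexError:", exc)

fixed = []               # a truly empty outer list
print(len(fixed))        # 0
```

This also explains why the question needed `len(...) == 1` to detect the "empty" case: the outer list was never empty to begin with.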
Answer 1 (score: 0)
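You have a = [[]]
When you append to this list after encountering the first word, you get
a = [ [], ['test', 0] ]
On the next iteration, the loop reaches the empty inner list first and tries to read its 0th element, which does not exist.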
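If the list-of-lists shape has to stay, one possible fix is to start from a truly empty list and append a new entry only after the whole list has been searched without a match. The sketch below is not the asker's code: `count_words` is a hypothetical helper, written in Python 3, and it follows the question's convention of storing 0 the first time a word is seen.

```python
def count_words(words):
    """Return [[word, count], ...] in first-seen order.

    Hypothetical helper. As in the question, a word's count starts
    at 0 on first sight and grows by 1 on each repeat.
    """
    counted = []  # start truly empty, not [[]]
    for word in words:
        for entry in counted:
            if entry[0] == word:
                entry[1] += 1
                break
        else:
            # The inner loop ended without a break, so the word is new.
            counted.append([word, 0])
    return counted


print(count_words(["test1", "test2", "test1", "test1"]))
# [['test1', 2], ['test2', 0]]
```

The `for ... else` form also avoids appending to the list while iterating over it, which the question's inner `else` branch would do for every non-matching entry.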
Answer 2 (score: 0)
Wouldn't this be simpler?
counts = dict()

def plus1(key):
    # add one to the count for key, starting at 1 on first sight
    if key in counts:
        counts[key] += 1
    else:
        counts[key] = 1

stoplist = "t1 t2 t1 t3 t1 t1 t2".split()
for word in stoplist:
    plus1(word)

counts
# {'t2': 2, 't3': 1, 't1': 4}
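The same idea can be written with `collections.Counter` from the standard library, and the result converted back to the list-of-lists shape the question asked for. This is a sketch in Python 3. The name `pairs` is illustrative, and counts here start at 1 for the first occurrence, unlike the 0-based example in the question.

```python
from collections import Counter

stoplist = "t1 t2 t1 t3 t1 t1 t2".split()

# Counter tallies each distinct word in a single pass.
counts = Counter(stoplist)

# Convert to [[word, count], ...] if that shape is still required.
pairs = [[word, n] for word, n in counts.items()]
print(pairs)
```

A dictionary lookup takes roughly constant time per word, whereas scanning a list of lists gets slower as the vocabulary grows.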