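I want to count how many times each word in an array occurs in a text file. If I type the print statement in the shell, I get the output. But when I run the code as a file, I get the error "IndexError: list index out of range". I am a beginner in Python, please help me.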
from collections import Counter
from array import *
import string
cnt=Counter()
file = open('output.txt', 'r')
word =[ ]
c=[ ]
count =0
first_word =[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
word_count = [ ]
new_array =['CC','CD','DT','EX','FW','IN','JJ','JJR','JJS','LS','MD','NN','NNS','NNP','NNPS','PDT',
            'POS','PRP','PRP$','RB','RBR','RBS','RP','SYM','TO','UH','VB','VBD','VBZ','WDT','WP$','WP','WRB']
for line in file:
    words = line.split()
    word.append(words)
for i in range(0,30):
    for j in range(0,33):
        if(new_array[j] in word[i][0]):
            first_word[j]+=1
        else:
            continue
print first_word
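Answer 0 (score: 0)
When you iterate over lists, don't pass hard-coded values to range; iterate using the length of the list instead. That way the loop index cannot run past the end of the list. So, replace: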
for i in range(0,30):
    for j in range(0,33):
with:
for i in range(len(word)):
    for j in range(len(first_word)):
I think this will solve the problem. Also, when you need to initialize a list with repeated values, such as:
first_word =[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
there is a simpler way in Python:
>>> first_word = [0]*33
>>> first_word
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
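To illustrate the point about hard-coded ranges, here is a minimal sketch that runs on in-memory sample lines instead of output.txt. The sample data, the shortened tag list, and the names sample_lines, tags, counts and rows are made up for the example. It goes one step further than range(len(...)) and loops over the lists directly. It also skips blank lines, because an empty line splits into an empty list and indexing its first element would raise IndexError as well.

```python
# Hypothetical sample data standing in for the lines of output.txt.
sample_lines = ["NN dog", "", "VB run", "NN cat"]

tags = ['NN', 'VB', 'JJ']            # shortened tag list for the example
counts = [0] * len(tags)             # one counter per tag

# Split every line into tokens, as the question's code does.
rows = [line.split() for line in sample_lines]

for row in rows:                     # no hard-coded range(0, 30)
    if not row:                      # a blank line gives [], so row[0] would fail
        continue
    for j, tag in enumerate(tags):   # no hard-coded range(0, 33)
        if tag in row[0]:            # same substring test as in the question
            counts[j] += 1

print(counts)  # [2, 1, 0]
```

Because the test is a substring check, a first token such as 'NNS' would also be counted under 'NN'; use == instead if only exact matches should count.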
Answer 1 (score: 0)
I think the following code will give you the result you want:
# Collect every whitespace-separated token from the file.
wordsFromFile = []
with open("output.txt", 'r') as f:  # the file is closed automatically
    for each_line in f:
        wordsFromFile.extend(each_line.split())
print(wordsFromFile)
new_array = ['CC','CD','DT','EX','FW','IN','JJ','JJR','JJS','LS','MD','NN','NNS','NNP','NNPS','PDT',
             'POS','PRP','PRP$','RB','RBR','RBS','RP','SYM','TO','UH','VB','VBD','VBZ','WDT','WP$','WP','WRB']
first_word = [0] * len(new_array)  # one counter per tag
# Count each token that exactly matches one of the tags.
for eachWordFromFile in wordsFromFile:
    if eachWordFromFile in new_array:
        first_word[new_array.index(eachWordFromFile)] += 1
# Output results:
for i in range(len(new_array)):
    print(str(new_array[i]) + ": " + str(first_word[i]))
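Since the question already imports collections.Counter, the same tally can also be sketched with it. This is only an illustration under a few assumptions: it reads hypothetical in-memory lines rather than output.txt, it counts exact token matches as the code above does, and count_tags and sample_lines are names invented for the example.

```python
from collections import Counter


def count_tags(lines, tags):
    """Count how often each tag in `tags` appears as a whitespace-separated token."""
    wanted = set(tags)
    counter = Counter(
        token
        for line in lines
        for token in line.split()
        if token in wanted
    )
    # Report tags in their original order; a Counter returns 0 for missing keys.
    return [(tag, counter[tag]) for tag in tags]


# Hypothetical sample data standing in for the lines of output.txt.
sample_lines = ["NN VB NN", "JJ", "", "PRP$ NN"]
result = count_tags(sample_lines, ['NN', 'VB', 'JJ', 'PRP$', 'CC'])
for tag, n in result:
    print(tag + ": " + str(n))
```

To run it against the real file, the open file object can be passed as the lines argument, since iterating over a file yields its lines.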