Counting the frequency of all words in an entire corpus

Time: 2017-10-26 18:05:26

Tags: python python-3.x count word-frequency

I am trying to count how many times each word appears in the entire corpus, but I get an error. My code:

    import os
    from collections import defaultdict

    import nltk

    corpus_root = os.path.abspath('../nlp_urdu/out1_data')
    mycorpus = nltk.corpus.reader.TaggedCorpusReader(corpus_root, '.*')
    noun = []
    count_freq = defaultdict(int)
    for infile in mycorpus.fileids():
        print(infile)
    for i in mycorpus.tagged_sents():
        texts = [word for word, pos in i if pos == 'NN']
        noun.append(texts)
        count_freq[noun] += 1  # the line the traceback points to
        print(count_freq)

The error I get is:

    count_freq[noun] += 1
    TypeError: unhashable type: 'list'

1 answer:

Answer 0 (score: 0)

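texts is a list of nouns, and noun in your code is a list of those lists.
count_freq is a dictionary, and every dictionary key must be hashable. A list is not, so each key must be a single noun (a string). Iterate over texts and increment the counter once per noun.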
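
To see why the original line fails, here is a minimal sketch that reproduces the problem without NLTK or the corpus. The two sample nouns are made up for illustration and are not taken from the asker's data.

```python
from collections import defaultdict

count_freq = defaultdict(int)
texts = ["kitab", "ghar"]  # hypothetical nouns from one sentence

# A list is mutable and therefore unhashable, so it cannot be a dict key.
try:
    count_freq[texts] += 1
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Strings are hashable, so counting each noun on its own works.
for noun in texts:
    count_freq[noun] += 1

print(dict(count_freq))
```

The exact wording of the TypeError message can differ between Python versions, but in each case it reports that a list is unhashable.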

    import os
    from collections import defaultdict

    import nltk

    corpus_root = os.path.abspath('../nlp_urdu/out1_data')
    mycorpus = nltk.corpus.reader.TaggedCorpusReader(corpus_root, '.*')
    count_freq = defaultdict(int)
    for infile in mycorpus.fileids():
        print(infile)
    for sent in mycorpus.tagged_sents():
        # keep only the words tagged as nouns in this sentence
        texts = [word for word, pos in sent if pos == 'NN']
        for noun in texts:
            # each key is a single string, which is hashable
            count_freq[noun] += 1

    print(count_freq)
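
As a possible alternative, the same counting can be sketched with collections.Counter from the standard library. This is not part of the original answer. The tagged_sents list below is a hypothetical stand-in for mycorpus.tagged_sents(), and count_nouns is a helper name invented for this example.

```python
from collections import Counter

# Hypothetical stand-in for mycorpus.tagged_sents(): a list of sentences,
# each sentence being a list of (word, tag) pairs.
tagged_sents = [
    [("larka", "NN"), ("kitab", "NN"), ("parhta", "VB")],
    [("kitab", "NN"), ("achi", "JJ"), ("hai", "VB")],
]


def count_nouns(sents, tag="NN"):
    """Count how often each word carrying `tag` occurs across all sentences."""
    return Counter(word for sent in sents for word, pos in sent if pos == tag)


noun_freq = count_nouns(tagged_sents)
print(noun_freq.most_common(2))  # [('kitab', 2), ('larka', 1)]
```

With the real corpus, passing mycorpus.tagged_sents() to count_nouns should behave the same way, assuming the noun tag in the data is exactly 'NN'. Counter also provides most_common(), which saves sorting the dictionary by hand.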