
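I am trying to "count" the bigrams in my corpus with NLTK. However, there still seems to be an error in my script. I can't figure out what I'm doing wrong, so I hope someone can give me at least a few pointers. Please keep in mind that I am quite new to this. Thanks!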

import collections
import nltk
from nltk.collocations import BigramCollocationFinder

tekst.collocations()  # tekst is assumed to be an nltk.Text built from the corpus
bgm = nltk.collocations.BigramAssocMeasures()
# from_words expects a sequence of word tokens, not the location of the corpus
finder = BigramCollocationFinder.from_words(mijn_corpus)
finder.apply_freq_filter(3)  # drop bigrams that occur fewer than 3 times
best_pmi = finder.nbest(bgm.pmi, 10)  # keep the 10 best bigrams by PMI
scored_bgm = finder.score_ngrams(bgm.likelihood_ratio)
prefix_keys = collections.defaultdict(list)
for key, score in scored_bgm:  # group on the first word of the bigram (was the undefined name `scored`)
    prefix_keys[key[0]].append((key[1], score))
for key in prefix_keys:  # strongest association first
    prefix_keys[key].sort(key=lambda x: -x[1])
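
For comparison, here is a minimal self-contained sketch of the same pipeline. It assumes the corpus is already a plain list of word tokens. The toy `tokens` list, the function name `bigrams_by_first_word` and the `min_freq` parameter are made up for illustration and are not part of NLTK or of the script above.

```python
import collections

from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

# Hypothetical toy corpus, already split into word tokens.
tokens = (
    "new york is big . new york is busy . "
    "i like new york . the city is big ."
).split()


def bigrams_by_first_word(words, min_freq=2):
    """Score bigrams by likelihood ratio and group them by their first word."""
    measures = BigramAssocMeasures()
    finder = BigramCollocationFinder.from_words(words)
    # Discard bigrams that occur fewer than min_freq times.
    finder.apply_freq_filter(min_freq)
    scored = finder.score_ngrams(measures.likelihood_ratio)

    grouped = collections.defaultdict(list)
    for (first, second), score in scored:
        grouped[first].append((second, score))
    # Within each group, put the strongest association first.
    for first in grouped:
        grouped[first].sort(key=lambda pair: -pair[1])
    return dict(grouped)


result = bigrams_by_first_word(tokens)
for first, followers in sorted(result.items()):
    print(first, followers)
```

With a real corpus, the token list would come from a tokenizer or a corpus reader instead of `str.split()`, and a higher `min_freq` (such as the 3 used above) is probably more sensible.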

Bigrams with NLTK: script problem

0 answers: