Question

Background: the following code is a toy example of bigram analysis:

import nltk
from nltk import bigrams
from nltk.tokenize import word_tokenize

text = "some nice words go here"
tokens = word_tokenize(text)
bi_tokens = bigrams(tokens)

bi_count = {}
for token in bi_tokens:
    if token not in bi_count:
        bi_count[token] = 1
    else:
        bi_count[token] += 1

Output:

print(bi_count)

{('go', 'here'): 1,
 ('nice', 'words'): 1,
 ('some', 'nice'): 1,
 ('words', 'go'): 1}

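Problem: I want to use a key (for example ('go', 'here')) to get the corresponding value (for example 1).

I tried searching http://www.nltk.org/api/nltk.html?highlight=freqdist and "How to access specific element of dictionary of tuples", but I couldn't find an answer.

Question: Is there a way to solve my problem with an nltk method or in any other way?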

Answer 1

>>> from collections import Counter
>>> from nltk import bigrams, word_tokenize
>>> text = "some nice words go here"

# Count no. of ngrams
>>> bigram_counter = Counter(bigrams(word_tokenize(text)))

# Iterate through the ngrams and their counts.
>>> for bg, count in bigram_counter.most_common():
...     print(bg, count)
... 
('some', 'nice') 1
('go', 'here') 1
('words', 'go') 1
('nice', 'words') 1

Answer:

# Access the Counter object. 
>>> bigram_counter[('some', 'nice')]
1
>>> bigram_counter[('words', 'go')]
1
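One thing worth knowing about the Counter approach is how it treats bigrams that never occurred. The sketch below is only an illustration: it swaps in str.split and zip for word_tokenize and bigrams (an assumption, so that it runs without NLTK or its tokenizer data), which happens to give the same bigrams for this simple sentence.

```python
from collections import Counter

text = "some nice words go here"

# Assumption: plain whitespace splitting stands in for nltk's word_tokenize.
tokens = text.split()

# zip(tokens, tokens[1:]) pairs each token with its successor,
# standing in for nltk.bigrams(tokens) here.
bigram_counter = Counter(zip(tokens, tokens[1:]))

# Indexing with the tuple key returns its count.
present = bigram_counter[("go", "here")]

# A Counter returns 0 for a key it has never seen, instead of raising KeyError,
# and the lookup does not add the key to the Counter.
absent = bigram_counter[("not", "there")]

print(present, absent)
```

So with a Counter, a lookup for an unseen bigram is safe and does not need a separate membership check.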


Answer 2

search_key = ('go', 'here')
for key, value in bi_count.items(): 
    if key == search_key:
        print(value)  # prints 1
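The loop above works, but since bi_count is an ordinary dict keyed by tuples, the value can probably be read directly without scanning every item. A minimal sketch, with the dict from the question written out by hand so that it runs on its own:

```python
# The counts from the question, hard-coded here so the example is self-contained.
bi_count = {
    ("go", "here"): 1,
    ("nice", "words"): 1,
    ("some", "nice"): 1,
    ("words", "go"): 1,
}

search_key = ("go", "here")

# Tuples are hashable, so a tuple works as a dict key like any other.
value = bi_count[search_key]

# For a key that may be absent, dict.get avoids a KeyError;
# the default of 0 is a choice made for this example.
missing = bi_count.get(("here", "go"), 0)

print(value, missing)
```

Note that the order inside the tuple matters: ('go', 'here') and ('here', 'go') are different keys.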

How do I access a key in a bigram counter dictionary?
