Background: the following code is a toy example of bigram analysis:
import nltk
from nltk import bigrams
from nltk.tokenize import word_tokenize
text = "some nice words go here"
tokens = word_tokenize(text)
bi_tokens = bigrams(tokens)
bi_count = {}
for token in bi_tokens:
    if token not in bi_count:
        bi_count[token] = 1
    else:
        bi_count[token] += 1
Output:
print(bi_count)
{('go', 'here'): 1,
('nice', 'words'): 1,
('some', 'nice'): 1,
('words', 'go'): 1}
Problem: I want to use a key (e.g. ('go', 'here')) to get the corresponding value (e.g. 1).
I tried searching http://www.nltk.org/api/nltk.html?highlight=freqdist and the question "How to access specific element of dictionary of tuples", but I could not find an answer.
Question: Is there a way to solve my problem, either with an nltk method or in any other way?
Answer 0 (score: 1)
>>> from collections import Counter
>>> from nltk import bigrams, word_tokenize
>>> text = "some nice words go here"
# Count no. of ngrams
>>> bigram_counter = Counter(bigrams(word_tokenize(text)))
# Iterate through the ngrams and their counts.
>>> for bg, count in bigram_counter.most_common():
...     print(bg, count)
...
('some', 'nice') 1
('go', 'here') 1
('words', 'go') 1
('nice', 'words') 1
Answer:
# Access the Counter object.
>>> bigram_counter[('some', 'nice')]
1
>>> bigram_counter[('words', 'go')]
1
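One detail that may be useful with the Counter approach: looking up a bigram that never occurred returns 0 instead of raising KeyError, and the lookup does not add the key. Below is a minimal sketch of that behaviour. To keep it dependency-free it assumes whitespace tokenization and a hypothetical helper make_bigrams in place of word_tokenize and nltk.bigrams; for this sample sentence the resulting pairs are the same.

```python
from collections import Counter


def make_bigrams(tokens):
    """Hypothetical stand-in for nltk.bigrams: pair each token with its successor."""
    return list(zip(tokens, tokens[1:]))


# Assumption: plain whitespace splitting instead of word_tokenize.
tokens = "some nice words go here".split()
bigram_counter = Counter(make_bigrams(tokens))

# A bigram that occurs in the text.
present = bigram_counter[("go", "here")]

# A bigram that never occurs: Counter returns 0 rather than raising KeyError.
absent = bigram_counter[("not", "there")]

# The failed lookup does not insert the key into the Counter.
still_missing = ("not", "there") not in bigram_counter

print(present, absent, still_missing)
```

With a plain dict such as bi_count from the question, the same missing-key lookup would raise KeyError, so bi_count.get(key, 0) is the closest equivalent there.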
Answer 1 (score: 0)
search_key = ('go', 'here')
for key, value in bi_count.items():
    if key == search_key:
        print(value)  # 1
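The loop above works, but since tuples are hashable a dict can also be indexed with the tuple directly, which avoids scanning every item. The sketch below compares the two on the question's sample sentence. It assumes whitespace tokenization and zip in place of the nltk calls so that it runs without extra packages; lookup_by_scan is a hypothetical name for the loop wrapped in a function.

```python
# Assumption: whitespace tokenization stands in for word_tokenize,
# and zip(tokens, tokens[1:]) stands in for nltk.bigrams.
tokens = "some nice words go here".split()

bi_count = {}
for pair in zip(tokens, tokens[1:]):
    if pair not in bi_count:
        bi_count[pair] = 1
    else:
        bi_count[pair] += 1

search_key = ("go", "here")


def lookup_by_scan(counts, key):
    """Hypothetical helper: the item-by-item loop, returning None if no key matches."""
    for k, v in counts.items():
        if k == key:
            return v
    return None


# Direct indexing with the tuple key.
direct = bi_count[search_key]

# The scan gives the same value, but visits items until it finds a match.
scanned = lookup_by_scan(bi_count, search_key)

# Membership test to guard against a KeyError on unseen bigrams.
has_unseen = ("no", "match") in bi_count

print(direct, scanned, has_unseen)
```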