Question

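For class, I have to complete a piece of code. It takes a list of tokens and should return a dictionary whose keys are the bigrams from the corpus (in the form produced by nltk.bigrams()) and whose values are the probability of each bigram occurring (based on that bigram's frequency in my corpus). My solution is: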

import nltk

a = nltk.FreqDist(nltk.bigrams("aaaaaaacbegdeg"))

I get a dictionary, but it is wrapped in a FreqDist, as shown below:

FreqDist({('a', 'a'): 6,
          ('a', 'c'): 1,
          ('b', 'e'): 1,
          ('c', 'b'): 1,
          ('d', 'e'): 1,
          ('e', 'g'): 2,
          ('g', 'd'): 1})

How do I get rid of the FreqDist wrapper? Best regards, Bianca

Answer 1

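An nltk.FreqDist object is a subclass of the native collections.Counter, which is itself a subclass of the native dict; see Difference between Python's collections.Counter and nltk.probability.FreqDist

You can simply cast it back to a native dict object, like this: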

>>> from nltk import FreqDist, bigrams
>>> a = FreqDist(bigrams("aaaaaaacbegdeg"))
>>> a
FreqDist({('a', 'a'): 6, ('e', 'g'): 2, ('d', 'e'): 1, ('c', 'b'): 1, ('b', 'e'): 1, ('a', 'c'): 1, ('g', 'd'): 1})
>>> dict(a)
{('d', 'e'): 1, ('a', 'a'): 6, ('c', 'b'): 1, ('e', 'g'): 2, ('b', 'e'): 1, ('a', 'c'): 1, ('g', 'd'): 1}
>>> b = dict(a)
>>> b
{('d', 'e'): 1, ('a', 'a'): 6, ('c', 'b'): 1, ('e', 'g'): 2, ('b', 'e'): 1, ('a', 'c'): 1, ('g', 'd'): 1}

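By the way, there is no real need to convert it to a dict object, because it already behaves like a dict for the main dict operations, such as key lookup and get():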

>>> a[('a', 'a')]
6
>>> b[('a', 'a')]
6

>>> a.get(('a', 'a'))
6
>>> b.get(('a', 'a'))
6
