FreqDist in Python: "..." at the end of the output

Time: 2014-03-14 21:18:52

Tags: python

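When I try to print a FreqDist object, I get " ..." at the end of the printed output. I searched the internet for an explanation but could not find one.

Please let me know where I am going wrong.

Code: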

for word in nltk.word_tokenize(lin):
    fdist.inc(word)

print fdist

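1 answer:

Answer 0 (score: 1)

A FreqDist holds key-value pairs (each sample and its count), and printing the object directly only gives an abbreviated view. To see every entry, you have to print them in a loop. Something like the following should work: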

import nltk
from nltk.tokenize import word_tokenize

lin = "A frequency distribution for the outcomes of an experiment. A frequency distribution records the number of times each outcome of an experiment has occurred. For example, a frequency distribution could be used to record the frequency of each word type in a document. Formally, a frequency distribution can be defined as a function mapping from each sample to the number of times that sample occurred as an outcome."

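# Start with an empty frequency distribution
fdist = nltk.FreqDist()

# Count each token (inc() is the older NLTK 2.x call)
for word in word_tokenize(lin):
    fdist.inc(word)

# Print every sample with its count (Python 2 print statement)
for f in fdist:
    print f, fdist[f]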

The result is:

frequency 5
of 5
a 4
distribution 4
the 4
an 3
each 3
, 2
A 2
as 2
be 2
number 2
outcome 2
sample 2
times 2
to 2
. 1
For 1
Formally 1
can 1
could 1
defined 1
document. 1
example 1
experiment 1
experiment. 1
for 1
from 1
function 1
has 1
in 1
mapping 1
occurred 1
occurred. 1
outcomes 1
record 1
records 1
that 1
type 1
used 1
word 1

Let us know if that helps.
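The " ..." in the question is consistent with a string form that lists only the first few samples and then marks the rest as omitted. The following is a minimal sketch of that idea in Python 3, not NLTK's actual code: `summarize` is a hypothetical helper, the cutoff of 10 is an assumption, and `collections.Counter` stands in for the frequency distribution.

```python
from collections import Counter

def summarize(counts, maxlen=10):
    """Hypothetical helper: show at most `maxlen` samples, then '...'."""
    items = ["%r: %r" % pair for pair in counts.most_common(maxlen)]
    if len(counts) > maxlen:
        items.append("...")  # marks the samples left out of the summary
    return "<%s>" % ", ".join(items)

# Twelve distinct samples, more than the assumed cutoff of 10
counts = Counter("a b c d e f g h i j k l a b a".split())

short = summarize(counts)  # abbreviated view, ends with '...'
full = ["%s %d" % pair for pair in counts.most_common()]  # every sample

print(short)
print("\n".join(full))
```

Looping over the samples, as in the answer, avoids the cutoff because each entry is printed individually.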

Edit

Another approach:

import nltk
from nltk.tokenize import word_tokenize

lin = "A frequency distribution for the outcomes of an experiment. A frequency distribution records the number of times each outcome of an experiment has occurred. For example, a frequency distribution could be used to record the frequency of each word type in a document. Formally, a frequency distribution can be defined as a function mapping from each sample to the number of times that sample occurred as an outcome."

# Build the distribution directly from the token list
tokens = word_tokenize(lin)
fdist = nltk.FreqDist(tokens)

# Print every sample with its count
for f in fdist:
    print f, fdist[f]

The output is the same.
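The code above uses the Python 2 print statement and `fdist.inc()`. In NLTK 3, `FreqDist` is a subclass of `collections.Counter` and `inc()` is no longer available, so counts are updated with `fdist[word] += 1`. Below is a rough Python 3 sketch of the same loop. To run without NLTK installed it uses `Counter` itself and a simple regex tokenizer as stand-ins (both are assumptions for illustration); with NLTK available, `nltk.FreqDist` and `word_tokenize` would take their place.

```python
import re
from collections import Counter

lin = ("A frequency distribution records the number of times "
       "each outcome of an experiment has occurred.")

# Stand-in tokenizer (assumption): words and punctuation as separate tokens
tokens = re.findall(r"\w+|[^\w\s]", lin)

fdist = Counter()
for word in tokens:
    fdist[word] += 1  # replaces the older fdist.inc(word)

# most_common() yields (sample, count) pairs, highest count first
for word, count in fdist.most_common():
    print(word, count)
```

Iterating with `most_common()` is used here because a plain `for f in fdist` loop over a `Counter` does not promise frequency order.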