Question

For a task, I'm using ConditionalProbDist with LidstoneProbDist as the estimator, adding +0.01 to the sample count for each bin.

I thought the following line of code would do this, but it raises a ValueError:

fd = nltk.ConditionalProbDist(fd,nltk.probability.LidstoneProbDist,0.01)

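I'm not sure how to format the arguments to ConditionalProbDist, and I haven't had much luck finding out through Python's help function or Google, so if anyone could set me straight it would be much appreciated!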

Answer 1

You probably don't need this anymore since the question is so old, but you can still pass the LidstoneProbDist arguments to ConditionalProbDist with the help of a lambda:

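# bins is the number of bins (possible sample values); it has to be defined before this point
estimator = lambda fdist, bins: nltk.LidstoneProbDist(fdist, 0.01, bins)
# arguments given after the estimator are forwarded to it for each condition
cpd = nltk.ConditionalProbDist(fd, estimator, bins)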
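For anyone who wants to try the lambda approach end to end, here is a small sketch. The toy (tag, word) pairs and the way bins is computed are assumptions made up for illustration, not something from the question, and it assumes NLTK 3, where arguments after the estimator are forwarded to it.

```python
import nltk

# Toy (condition, sample) pairs, invented purely for illustration.
pairs = [("DT", "the"), ("NN", "cat"), ("VB", "sat"), ("DT", "the"), ("NN", "mat")]
cfd = nltk.ConditionalFreqDist(pairs)

# Assumption: use the number of distinct words as the bin count for every condition.
bins = len({word for _, word in pairs})  # the, cat, sat, mat

# The lambda receives each condition's FreqDist plus the forwarded bins argument.
estimator = lambda fdist, bins: nltk.LidstoneProbDist(fdist, 0.01, bins)
cpd = nltk.ConditionalProbDist(cfd, estimator, bins)

p_the = cpd["DT"].prob("the")   # (2 + 0.01) / (2 + 4 * 0.01)
p_cat = cpd["DT"].prob("cat")   # never seen after DT, but still non-zero
print(p_the, p_cat)
```

Note that bins must be at least as large as the number of distinct samples seen under any single condition, otherwise LidstoneProbDist should complain.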

Answer 2

I found the probability tutorial on the NLTK website very useful as a reference.

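As mentioned in the answer above, using a lambda expression is a good idea, because ConditionalProbDist generates a frequency distribution (nltk.FreqDist) on the fly for each condition and passes it to the estimator.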

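A more subtle point is that you can't pass the bins parameter through if you don't know how many bins your input sample has in the first place! However, a FreqDist makes its number of bins available as FreqDist.B() (see the docs).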

Instead, use the FreqDist as the only parameter to your lambda:

from nltk.probability import ConditionalProbDist, LidstoneProbDist
# ...

# Using the given parameters of one extra bin and a gamma of 0.01
lidstone_estimator = lambda fd: LidstoneProbDist(fd, 0.01, fd.B() + 1)
conditional_pd = ConditionalProbDist(conditional_fd, lidstone_estimator)
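To see what gamma and the bin count actually do in the estimator above, here is a plain-Python sketch of the Lidstone formula, (count + gamma) / (N + bins * gamma). It does not use NLTK at all; the function name lidstone_prob and the toy sentence are made up for illustration.

```python
from collections import Counter

def lidstone_prob(counts, sample, gamma, bins):
    """Lidstone estimate: (count + gamma) / (N + bins * gamma)."""
    if bins < 1:
        raise ValueError("need at least one bin")
    total = sum(counts.values())  # N, the total number of observations
    return (counts[sample] + gamma) / (total + bins * gamma)

counts = Counter("the cat sat on the mat".split())
bins = len(counts) + 1  # observed sample types plus one extra bin for unseen samples
gamma = 0.01

p_the = lidstone_prob(counts, "the", gamma, bins)
p_unseen = lidstone_prob(counts, "dog", gamma, bins)

# The observed types plus the single extra bin account for all the probability mass.
total_mass = sum(lidstone_prob(counts, w, gamma, bins) for w in counts) + p_unseen
print(p_the, p_unseen, total_mass)
```

With bins set to the observed count plus one, the unseen sample gets a small non-zero probability and everything still sums to 1, which is the reason for the "+ 1" in the lambda.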

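I know this question is very old now, but I also struggled to find documentation, so I'm recording it here in case someone else down the line runs into a similar problem.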

Good luck (with fnlp)!

Python: NLTK ValueError: A Lidstone probability distribution must have at least one bin?
