Multinomial PMF in python scipy / numpy

Asked: 2012-12-16 17:51:19

Tags: python numpy scipy probability scientific-computing

Is there a built-in function in scipy / numpy for getting the PMF of a multinomial? I'm not sure whether binom generalizes in the correct way, e.g.

# Attempt to define multinomial with n = 10, p = [0.1, 0.1, 0.8]
rv = scipy.stats.binom(10, [0.1, 0.1, 0.8])
# Score the outcome 4, 4, 2
rv.pmf([4, 4, 2])

What is the correct way to do this? Thanks.

1 answer:

Answer 0 (score: 9)

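There is no built-in function that I know of, and the binomial probabilities do not generalize: the counts must sum to n, so you have to normalise over a different set of possible outcomes, and independent binomials do not take care of that constraint. However, it is fairly straightforward to implement yourself, for example: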

import math

class Multinomial(object):
  def __init__(self, params):
    self._params = params

  def pmf(self, counts):
    if not(len(counts)==len(self._params)):
      raise ValueError("Dimensionality of count vector is incorrect")

    prob = 1.
    for i,c in enumerate(counts):
      prob *= self._params[i]**counts[i]

    return prob * math.exp(self._log_multinomial_coeff(counts))

  def log_pmf(self,counts):
    if not(len(counts)==len(self._params)):
      raise ValueError("Dimensionality of count vector is incorrect")

    prob = 0.
    for i,c in enumerate(counts):
      prob += counts[i]*math.log(self._params[i])

    return prob + self._log_multinomial_coeff(counts)

  def _log_multinomial_coeff(self, counts):
    return self._log_factorial(sum(counts)) - sum(self._log_factorial(c)
                                                    for c in counts)

  def _log_factorial(self, num):
    if round(num) != num or num < 0:
      raise ValueError("Can only compute the factorial of non-negative ints")
    return sum(math.log(n) for n in range(1, int(num) + 1))

m = Multinomial([0.1, 0.1, 0.8])
print(m.pmf([4, 4, 2]))

# Output: approximately 2.016e-05
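
As a side check on the normalisation point made before the class, here is a minimal standalone sketch. The helper names multinomial_pmf, binomial_pmf and outcomes are made up for this illustration and are not part of the class above, and the code assumes Python 3.8+ for math.comb and math.prod. It sums both candidates over every count vector that adds up to n: the multinomial PMF should total 1, while the product of independent binomial PMFs should fall well short of it.

```python
import math
from itertools import product


def multinomial_pmf(counts, params):
    """Multinomial PMF; n is implied by sum(counts)."""
    coeff = math.factorial(sum(counts))
    for c in counts:
        coeff //= math.factorial(c)  # exact integer division at every step
    prob = float(coeff)
    for c, p in zip(counts, params):
        prob *= p ** c
    return prob


def binomial_pmf(k, n, p):
    """PMF of a single Binomial(n, p) variable at k."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def outcomes(n, dim):
    """All count vectors of length dim whose entries sum to n."""
    return [c for c in product(range(n + 1), repeat=dim) if sum(c) == n]


n, params = 10, [0.1, 0.1, 0.8]
valid = outcomes(n, len(params))

# Total mass the multinomial assigns to the valid outcomes.
multinomial_total = sum(multinomial_pmf(c, params) for c in valid)

# Total mass that independent binomials assign to the same outcomes.
independent_total = sum(
    math.prod(binomial_pmf(k, n, p) for k, p in zip(c, params))
    for c in valid
)

print(multinomial_total)
print(independent_total)
```

The second total is the probability that three independent binomial draws happen to sum to n, which is why it cannot serve as a PMF over the valid outcomes without renormalising.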

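The implementation of the multinomial coefficient in the Multinomial class is somewhat naive, and it works in log space to prevent overflow. Also note that n is superfluous as a parameter, since it is given by the sum of the counts (and the same parameter set works for any n). Furthermore, since the PMF will quickly underflow for moderately large n or large dimensionality, you are better off working in log space (a log PMF is provided here too!).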
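
One possible way to make the coefficient less naive, sketched here as an assumption rather than as part of the original answer, is to swap the summed logs for math.lgamma, using the identity log(k!) = lgamma(k + 1). The standalone function names log_multinomial_coeff and log_pmf are hypothetical, and the handling of zero probabilities is a choice made for this sketch.

```python
import math


def log_multinomial_coeff(counts):
    """log(n! / (c_1! * ... * c_k!)) using lgamma(k + 1) == log(k!)."""
    return math.lgamma(sum(counts) + 1) - sum(math.lgamma(c + 1) for c in counts)


def log_pmf(counts, params):
    """Log of the multinomial PMF; n is implied by sum(counts)."""
    if len(counts) != len(params):
        raise ValueError("Dimensionality of count vector is incorrect")
    log_prob = log_multinomial_coeff(counts)
    for c, p in zip(counts, params):
        if c == 0:
            continue  # p ** 0 == 1, so the term contributes nothing
        if p == 0:
            return float("-inf")  # impossible outcome (choice made in this sketch)
        log_prob += c * math.log(p)
    return log_prob


params = [0.1, 0.1, 0.8]

# Small case: exponentiating recovers the PMF value from the example above.
small = math.exp(log_pmf([4, 4, 2], params))

# Large case: a single factor of the direct product already underflows to 0.0,
# while the log PMF stays finite.
underflowed = 0.1 ** 4000
large = log_pmf([4000, 4000, 2000], params)

print(small)
print(underflowed)
print(large)
```

If SciPy is available, later releases also ship scipy.stats.multinomial, whose pmf and logpmf methods could serve as a cross-check; the sketch above deliberately sticks to the standard library.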