Question

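I want to solve the coupon collector's problem in the general case (where each coupon has a different probability) using Flajolet's formula, which I found on Wikipedia (see https://en.wikipedia.org/wiki/Coupon_collector%27s_problem). According to the formula, I have to compute an integral whose integrand is a product. I am using scipy.integrate.quad with lambda notation for the integration. The problem is that the number of factors in the integrand is not fixed (the parameters come from a list). When I try to multiply the factors of the integrand together, I get an error, because it seems I cannot multiply formal expressions. But if I don't build the integrand that way, I don't know how to get the integration variable x into it.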
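
For reference, the formula as stated on the linked Wikipedia page is reproduced below. The symbols follow that page rather than the code in this post: p_1, ..., p_m are the coupon probabilities, T is the number of draws needed to collect all m coupons, and t is the integration variable (called x in the code).

```latex
E[T] = \int_0^\infty \Bigl( 1 - \prod_{i=1}^{m} \bigl( 1 - e^{-p_i t} \bigr) \Bigr) \, dt
```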

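I have found out how to integrate a product when there are only two factors, for example, and that does not seem to involve double integration or anything like that. Can anyone help? (I am quite new to this.)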

import numpy as np
from scipy import integrate
....
def compute_general_case(p_list):
    integrand = 1
    for p in p_list:
        integrand_factor = lambda x: 1 - np.exp(-p * x)
        integrand *= integrand_factor
    integrand = 1 - integrand
    erg = integrate.quad(integrand, 0, np.inf)
    print(erg)

Answer 1

You can define the integrand function with an arbitrary number of arguments, as long as you pass them to quad using args=:

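import numpy as np
from scipy.integrate import quad

def integrand(x, *p_list):
    p_list = np.asarray(p_list)
    # np.prod replaces np.product, which was removed in NumPy 2.0
    return 1 - np.prod(1 - np.exp(-x * p_list))   # no need to for-loop a product in numpy

# args is given as a tuple so that quad unpacks it into *p_list
result, abserr = quad(integrand, 0, np.inf, args=(1, 1, 1, 1))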
print(result, abserr)
>> 2.083333333333334 2.491001112400493e-10

For more information, see here
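
The loop from the question can also be kept. The error there comes from `integrand *= integrand_factor`, which tries to multiply a number by a function object. A minimal sketch, assuming only numpy and scipy, moves the loop inside a single function of x so that only numbers are ever multiplied. The name `compute_general_case` is reused from the question; the nested `integrand` and the returned value (instead of a print) are illustrative choices.

```python
import numpy as np
from scipy import integrate


def compute_general_case(p_list):
    """Expected number of draws, using a plain Python loop for the product."""

    def integrand(x):
        # Build the product factor by factor for this particular x.
        product = 1.0
        for p in p_list:
            product *= 1.0 - np.exp(-p * x)
        return 1.0 - product

    value, abserr = integrate.quad(integrand, 0, np.inf)
    return value


# Two equally likely coupons: the integral evaluates to 3.
print(compute_general_case([0.5, 0.5]))
```

This is slower than the vectorised version for long lists, since the product is built in Python rather than in numpy, but it stays close to the structure of the original attempt.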

Answer 2

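Thanks for solving this problem. Following up on @Mstaino's answer, you don't even need args, because you can pass the parameters to the function via a closure: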

def coupon_collector_expected_samples(probs):
    """
    Find the expected number of samples before all "coupons" (with a
    non-uniform probability mass) are "collected".

    Args:
        probs (ndarray): probability mass for each unique item

    References:
        https://en.wikipedia.org/wiki/Coupon_collector%27s_problem
        https://www.combinatorics.org/ojs/index.php/eljc/article/view/v20i2p33/pdf
        https://stackoverflow.com/questions/54539128/scipy-integrand-is-product

    Example:
        >>> import numpy as np
        >>> import ubelt as ub
        >>> # Check EV of samples for a non-uniform distribution
        >>> probs = [0.38, 0.05, 0.36, 0.16, 0.05]
        >>> ev = coupon_collector_expected_samples(probs)
        >>> print('ev = {}'.format(ub.repr2(ev, precision=4)))
        ev = 30.6537

        >>> # Use general solution on a uniform distribution
        >>> probs = np.ones(4) / 4
        >>> ev = coupon_collector_expected_samples(probs)
        >>> print('ev = {}'.format(ub.repr2(ev, precision=4)))
        ev = 8.3333

        >>> # Check that this is the same as the solution for the uniform case
        >>> import sympy
        >>> n = len(probs)
        >>> uniform_ev = float(sympy.harmonic(n) * n)
        >>> assert np.isclose(ev, uniform_ev)
    """
    import numpy as np
    from scipy import integrate
    probs = np.asarray(probs)
    # Philippe Flajolet's generalized expected value integral
    def _integrand(t):
        return 1 - np.prod(1 - np.exp(-probs * t))
    ev, abserr = integrate.quad(func=_integrand, a=0, b=np.inf)
    return ev
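
As an independent check on the integral, the same expectation can be written as a finite inclusion-exclusion sum over non-empty subsets of coupons, which follows from expanding the product and integrating each exponential term separately. The sketch below is only practical for a small number of coupons, because it visits 2^n - 1 subsets. The function name `coupon_collector_ev_exact` is hypothetical and not part of the answer above.

```python
from itertools import combinations


def coupon_collector_ev_exact(probs):
    """Inclusion-exclusion form: sum over non-empty subsets S of
    (-1)^(|S|+1) / sum(p_i for i in S)."""
    probs = list(probs)
    total = 0.0
    for size in range(1, len(probs) + 1):
        # Odd-sized subsets are added, even-sized subsets are subtracted.
        sign = 1.0 if size % 2 == 1 else -1.0
        for subset in combinations(probs, size):
            total += sign / sum(subset)
    return total


# Uniform case with 4 coupons: about 8.3333 (= 4 * H_4)
print(coupon_collector_ev_exact([0.25] * 4))
```

For a handful of coupons, comparing this value against the result of the quad-based function is a quick way to confirm that the numerical integration has converged.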

You can also see that my implementation is somewhat faster in the timings below:

import numpy as np
import timerit
from scipy import integrate

probs = np.random.rand(100)

def orig_method(p_list):
    def integrand(x, *p_list):
        p_list = np.asarray(p_list)
        return 1 - np.prod(1 - np.exp(-x * p_list))
    # quad wraps a non-tuple args in a 1-tuple, so the array arrives as one extra argument
    result, abserr = integrate.quad(integrand, 0, np.inf, args=p_list)
    return result

ti = timerit.Timerit(100, bestof=10, verbose=2)
for timer in ti.reset('orig_implementation'):
    with timer:
        orig_method(probs)

for timer in ti.reset('my_implementation'):
    with timer:
        coupon_collector_expected_samples(probs)

# Results:
# Timed orig_implementation for: 100 loops, best of 10
#     time per loop: best=7.267 ms, mean=7.954 ± 0.5 ms
# Timed my_implementation for: 100 loops, best of 10
#     time per loop: best=5.618 ms, mean=5.648 ± 0.0 ms
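
If timerit is not installed, a similar comparison can be sketched with the standard-library timeit module. The helper names `with_args` and `with_closure`, the uniform 20-coupon input, and the repeat count are assumptions made for this illustration; absolute timings will differ from the figures above.

```python
import timeit

import numpy as np
from scipy import integrate


def with_args(probs):
    # Parameters are handed to the integrand through quad's args tuple.
    def integrand(x, *p):
        return 1 - np.prod(1 - np.exp(-x * np.asarray(p)))
    return integrate.quad(integrand, 0, np.inf, args=tuple(probs))[0]


def with_closure(probs):
    # Parameters are captured once by the nested function.
    probs = np.asarray(probs)

    def integrand(x):
        return 1 - np.prod(1 - np.exp(-x * probs))
    return integrate.quad(integrand, 0, np.inf)[0]


probs = np.full(20, 1 / 20)
repeats = 20
for fn in (with_args, with_closure):
    seconds = timeit.timeit(lambda: fn(probs), number=repeats)
    print(f"{fn.__name__}: {seconds / repeats * 1e3:.3f} ms per call")
```

Both helpers evaluate the same integrand, so any gap between them reflects the cost of unpacking the arguments and rebuilding the array on every call, not a difference in the result.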

Computing an integral with scipy where the integrand is a product whose parameters come from an (arbitrarily long) list
