Question

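(Please check the edits at the end of the post before diving into the source.)

I am plotting a histogram of a population that appears to follow a log-Laplacian distribution: [image]

I am trying to draw a line of best fit over it to verify my hypothesis, but I am having trouble getting meaningful results.

I am using the Laplace PDF definition from Wikipedia and raising 10 to the power of the PDF (to "reverse" the effect of the log histogram).

What am I doing wrong?

Here is my code. I pipe the data in through standard input (cat pop.txt | python hist.py) - here's a sample population.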

from pylab import *
import numpy    
def laplace(x, mu, b):
    return 10**(1.0/(2*b) * numpy.exp(-abs(x - mu)/b))    
def main():
    import sys
    # list() so that max(), min() and hist() can each traverse the data on Python 3
    num = list(map(int, sys.stdin.read().split()))
    nbins = max(num) - min(num)
    n, bins, patches = hist(num, nbins, range=(min(num), max(num)), log=True, align='left')
    loc, scale = 0., 1.
    x = numpy.arange(bins[0], bins[-1], 1.)
    pdf = laplace(x, 0., 1.)
    plot(x, pdf)
    width = max(-min(num), max(num))
    xlim((-width, width))
    ylim((1.0, 10**7))
    show()
if __name__ == '__main__':
    main()

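Edit

OK, here is an attempt to match it against a regular Laplace distribution (as opposed to a log-Laplace). Differences from the attempt above:

The histogram is normalized
The histogram is linear (not log)
The laplace function returns the plain PDF (no 10** applied)

Output: [image]

As you can see, it is not the best match, but the numbers (histogram and Laplace PDF) are at least in the same ballpark now. I think a log-Laplace would match better. My approach (source above) did not work. Can anyone suggest one that does?

Source: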

from pylab import *
import numpy   
def laplace(x, mu, b):
    return 1.0/(2*b) * numpy.exp(-abs(x - mu)/b)
def main():
    import sys
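    # list() so that max(), min() and hist() can each traverse the data on Python 3
    num = list(map(int, sys.stdin.read().split()))
    nbins = max(num) - min(num)
    # density=True replaces the normed=True keyword, which newer Matplotlib no longer accepts
    n, bins, patches = hist(num, nbins, range=(min(num), max(num)), log=False, align='left', density=True)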
    loc, scale = 0., 0.54
    x = numpy.arange(bins[0], bins[-1], 1.)
    pdf = laplace(x, loc, scale)
    plot(x, pdf)
    width = max(-min(num), max(num))
    xlim((-width, width))
    show()
if __name__ == '__main__':
    main()

Answer 1

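Your laplace() function does not seem to be a Laplace distribution. Also, numpy.log() is a natural logarithm (base e), not a decimal one.
Your histogram does not seem to be normalized, while the distribution is.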
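
To make the normalization point concrete: a density integrates to 1, while an un-normalized histogram of N samples with bin width w sums to N, so the curve to overlay on raw counts is N * w * pdf(x). On a log axis that curve is drawn as it is, with no 10** applied. The following is only a sketch under stated assumptions: the data are synthetic (generated with numpy.random rather than read from pop.txt), and names such as expected_counts are illustrative.

```python
import numpy as np

def laplace_pdf(x, mu, b):
    """Laplace density: 1/(2b) * exp(-|x - mu| / b)."""
    return 1.0 / (2 * b) * np.exp(-np.abs(x - mu) / b)

# Hypothetical stand-in for pop.txt: Laplace samples rounded to integers.
rng = np.random.default_rng(0)
num = np.rint(rng.laplace(loc=0.0, scale=4.0, size=100_000)).astype(int)

# One unit-width bin per integer value, centred on the integers.
edges = np.arange(num.min() - 0.5, num.max() + 1.5, 1.0)
counts, _ = np.histogram(num, bins=edges)
centers = 0.5 * (edges[:-1] + edges[1:])
width = edges[1] - edges[0]

# Expected raw counts per bin: N * bin_width * density.
expected_counts = num.size * width * laplace_pdf(centers, 0.0, 4.0)

print(counts.sum(), round(expected_counts.sum()))
```

Plotting counts as bars and expected_counts as a line, then switching the y axis to a log scale, should put both on the same footing without transforming the PDF.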

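EDIT:

Don't use blanket imports like from pyplot import *, they will bite you.
If you are checking consistency with the Laplace distribution (or its log), use the fact that the latter is symmetric around mu: fix mu at the maximum of your histogram, and you have a one-parameter problem. You can also use just one half of the histogram.
Use numpy's histogram function - that way you get the histogram itself, which you can then fit with a Laplace distribution (and/or its log). Chi-square will tell you how good (or bad) the consistency is. For the fitting you can use, e.g., the scipy.optimize.leastsq routine (http://www.scipy.org/Cookbook/FittingData)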
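
The recipe above could be sketched roughly as follows. This is one possible reading, not a definitive implementation. Assumptions: the data are synthetic, only bins to the right of the peak with at least 10 entries are used, the residual is taken on the log of the counts, and the amplitude is pinned to N times the bin width. The helper names model_counts and log_residuals are made up for illustration.

```python
import numpy as np
from scipy.optimize import leastsq

def laplace_pdf(x, mu, b):
    """Laplace density: 1/(2b) * exp(-|x - mu| / b)."""
    return 1.0 / (2 * b) * np.exp(-np.abs(x - mu) / b)

# Hypothetical stand-in for the real data: Laplace samples rounded to integers.
rng = np.random.default_rng(1)
num = np.rint(rng.laplace(loc=0.0, scale=4.0, size=200_000)).astype(int)

# Step 1: get the histogram itself (one unit-width bin per integer value).
edges = np.arange(num.min() - 0.5, num.max() + 1.5, 1.0)
counts, _ = np.histogram(num, bins=edges)
centers = 0.5 * (edges[:-1] + edges[1:])

# Step 2: fix mu at the histogram maximum, leaving b as the only parameter.
mu = centers[np.argmax(counts)]

# Step 3: use one half only. Assumption: bins right of the peak holding at
# least 10 entries, so the log is defined and the noise stays moderate.
mask = (centers > mu) & (counts >= 10)
x_fit, y_fit = centers[mask], counts[mask]

def model_counts(b):
    # Expected counts per bin: N * bin_width * density, with bin_width = 1.
    return num.size * laplace_pdf(x_fit, mu, b)

def log_residuals(params):
    # abs() keeps the scale positive if the optimizer tries a negative value.
    return np.log(y_fit) - np.log(model_counts(abs(params[0])))

solution, ier = leastsq(log_residuals, x0=[1.0])
b_fit = abs(solution[0])

# Pearson chi-square over the fitted bins; one parameter was fitted.
expected = model_counts(b_fit)
chi2 = np.sum((y_fit - expected) ** 2 / expected)
dof = x_fit.size - 1
print(f"mu = {mu}, b = {b_fit:.3f}, chi2/dof = {chi2 / dof:.2f}")
```

The peak bin is left out of the fit because the density evaluated at the cusp overestimates the content of that bin. A chi2/dof far above 1 would suggest the Laplace shape does not describe the data well.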

Answer 2

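I found a workaround for the problem I was having. Instead of using matplotlib.hist, I use numpy.histogram together with matplotlib.bar to compute the histogram and draw it in two separate steps.

I am not sure whether there is a way to do this with matplotlib.hist - it would definitely be more convenient, though. [image]

You can see that it is a much better match.

My problem now is that I need to estimate the scale parameter of the PDF.

Source: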

from pylab import *
import numpy

def laplace(x, mu, b):
    """http://en.wikipedia.org/wiki/Laplace_distribution"""
    return 1.0/(2*b) * numpy.exp(-abs(x - mu)/b)

def main():
    import sys
    # list() so that max(), min() and histogram() can each traverse the data on Python 3
    num = list(map(int, sys.stdin.read().split()))
    nbins = max(num) - min(num)
    count, bins = numpy.histogram(num, nbins)
    bins = bins[:-1]
    assert len(bins) == nbins
    #
    # FIRST we take the log of the histogram, THEN we normalize it.
    # Clean up after divide by zero
    #
    with numpy.errstate(divide='ignore'):  # log(0) gives -inf for empty bins
        count = numpy.log(count)
    count[numpy.isneginf(count)] = 0
    count = count/max(count)

    loc = 0.
    scale = 4.
    x = numpy.arange(bins[0], bins[-1], 1.)
    pdf = laplace(x, loc, scale)
    pdf = pdf/max(pdf)

    width=1.0
    bar(bins-width/2, count, width=width)
    plot(x, pdf, color='r')
    xlim(min(num), max(num))
    show()

if __name__ == '__main__':
    main()
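
On the open point of estimating the scale parameter: for a Laplace distribution the maximum-likelihood estimates have a closed form, namely the sample median for mu and the mean absolute deviation from that median for b. A minimal sketch, assuming synthetic data in place of the real population (the sample and the names mu_hat and b_hat are illustrative):

```python
import numpy as np

# Hypothetical sample standing in for the real population data.
rng = np.random.default_rng(2)
num = rng.laplace(loc=0.0, scale=4.0, size=100_000)

mu_hat = np.median(num)                # maximum-likelihood estimate of mu
b_hat = np.mean(np.abs(num - mu_hat))  # maximum-likelihood estimate of b

print(f"mu_hat = {mu_hat:.3f}, b_hat = {b_hat:.3f}")
```

scipy.stats.laplace.fit(num) should return a comparable (loc, scale) pair if a library routine is preferred.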

Matplotlib histogram with log Laplacian PDF
