在Mathematica中使用BinCounts或直方图的FindFit

时间:2011-08-24 18:33:21

标签: wolfram-mathematica histogram bin

daList={62.8347, 88.5806, 74.8825, 61.1739, 66.1062, 42.4912, 62.7023, 
        39.0254, 48.3332, 48.5521, 51.5432, 69.4951, 60.0677, 48.4408, 
        59.273, 30.0093, 94.6293, 43.904, 59.6066, 58.7394, 68.6183, 83.0942, 
        73.1526, 47.7382, 75.6227, 58.7549, 59.2727, 26.7627, 89.493, 
        49.3775, 79.9154, 73.2187, 49.5929, 84.4546, 28.3952, 75.7541, 
        72.5095, 60.5712, 53.2651, 33.5062, 80.4114, 63.7094, 90.2438, 
        55.2248, 44.437, 28.1884, 4.77477, 36.8398, 70.3579, 28.1913, 
        43.9001, 23.8907, 12.7823, 22.3473, 57.6724, 49.0148}

以上是我正在处理的实际数据样本。 我使用BinCounts,但这只是为了说明直观图应该这样做:我想适合直方图的形状

Histogram@data

enter image description here

我知道如何自己适合数据点:

model = 0.2659615202676218` E^(-0.2222222222222222` (x - \[Mu])^2)
FindFit[data, model, \[Mu], x]

这远非我想做的事情:我如何在Mathematica中拟合bin计数/直方图?

1 个答案:

答案 0 :(得分:19)

如果你有MMA V8,你可以使用新的DistributionFitTest

disFitObj = DistributionFitTest[daList, NormalDistribution[a, b],"HypothesisTestData"];

Show[
   SmoothHistogram[daList], 
   Plot[PDF[disFitObj["FittedDistribution"], x], {x, 0, 120}, 
        PlotStyle -> Red
   ], 
   PlotRange -> All
]

enter image description here

disFitObj["FittedDistributionParameters"]

(* ==> {a -> 55.8115, b -> 20.3259} *)

disFitObj["FittedDistribution"]

(* ==> NormalDistribution[55.8115, 20.3259] *)

它也可以适合其他发行版。


另一个有用的V8功能是HistogramList,它为您提供Histogram的分箱数据。它也需要所有Histogram个选项。

{bins, counts} = HistogramList[daList]

(* ==> {{0, 20, 40, 60, 80, 100}, {2, 10, 20, 17, 7}} *)

centers = MovingAverage[bins, 2]

(* ==> {10, 30, 50, 70, 90} *)

model = s E^(-((x - \[Mu])^2/\[Sigma]^2));

pars = FindFit[{centers, counts}\[Transpose], 
                     model, {{\[Mu], 50}, {s, 20}, {\[Sigma], 10}}, x]

(* ==> {\[Mu] -> 56.7075, s -> 20.7153, \[Sigma] -> 31.3521} *)

Show[Histogram[daList],Plot[model /. pars // Evaluate, {x, 0, 120}]]

enter image description here

你也可以试试NonlinearModeFit。在这两种情况下,最好使用您自己的初始参数值,以最大限度地获得全局最佳拟合。


在V7中没有HistogramList,但您可以使用this获得相同的列表:

  

直方图[data,bspec,fh]中的函数fh应用于两个   参数:一个列表{{下标[b,1],下标[b,   2]},{下标[b,2],下标[b,3]},[省略号]}和相应的   计数列表{下标[c,1],下标[c,2],[省略号]}。该   函数应该返回一个高度列表,用于每个   下标[c,i]。

这可以使用如下(from my earlier answer):

Reap[Histogram[daList, Automatic, (Sow[{#1, #2}]; #2) &]][[2]]

(* ==> {{{{{0, 20}, {20, 40}, {40, 60}, {60, 80}, {80, 100}}, {2, 
    10, 20, 17, 7}}}} *)

当然,您仍然可以使用BinCounts但是您错过了MMA的自动分级算法。您必须提供自己的分箱:

counts = BinCounts[daList, {0, Ceiling[Max[daList], 10], 10}]

(* ==>  {1, 1, 6, 4, 11, 9, 9, 8, 5, 2} *)

centers = Table[c + 5, {c, 0, Ceiling[Max[daList] - 10, 10], 10}]

(* ==>  {5, 15, 25, 35, 45, 55, 65, 75, 85, 95} *)

pars = FindFit[{centers, counts}\[Transpose],
                model, {{\[Mu], 50}, {s, 20}, {\[Sigma], 10}}, x]

(* ==> \[Mu] -> 56.6575, s -> 10.0184, \[Sigma] -> 32.8779} *)

Show[
   Histogram[daList, {0, Ceiling[Max[daList], 10], 10}], 
   Plot[model /. pars // Evaluate, {x, 0, 120}]
]

enter image description here

正如您所看到的,拟合参数可能在很大程度上取决于您的分箱选择。特别是我称之为s的参数关键取决于箱的数量。箱数越多,单个箱柜的数量越低,s的值就越低。