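Is there a way in MATLAB to check whether a histogram distribution is unimodal or bimodal?
Edit
Do you think Hartigan's Dip Statistic would work? I tried passing an image to it and got the value 0. What does this mean?
Also, when an image is passed in, does it test the distribution of the image's gray-level histogram?
Thanks.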
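Answer 0: (score: 5)
Here is a script that uses Nic Price's implementation of Hartigan's Dip Test to identify unimodal distributions. The tricky part is computing xpdf, which is not a probability density function but a sorted sample.
p_value is the probability of obtaining a test statistic at least as extreme as the one actually observed, assuming the null hypothesis is true. In this case, the null hypothesis is that the distribution is unimodal, so a small p_value is evidence against unimodality.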
close all; clear;

nboot = 500;
sample_size = [256 256];

% Unimodal: normally distributed sample
sample2d = normrnd(0.0, 10.0, sample_size);
[xpdf, n, b] = compute_xpdf(sample2d);
[dip, p_value, xlow, xup] = HartigansDipSignifTest(xpdf, nboot);

figure;
subplot(1,2,1);
bar(b, n)   % bin centers on the x axis, counts on the y axis
title(sprintf('Dip test p-value %.2f', p_value))

% Bimodal: the square-root transform empties the region around zero
sample2d = sign(sample2d) .* (abs(sample2d) .^ 0.5);
[xpdf, n, b] = compute_xpdf(sample2d);
[dip, p_value, xlow, xup] = HartigansDipSignifTest(xpdf, nboot);

subplot(1,2,2);
bar(b, n)
title(sprintf('Dip test p-value %.2f', p_value))

print -dpng modality.png

% In a MATLAB script file, local functions go at the end (R2016b or later).
function [x2, n, b] = compute_xpdf(x)
    x2 = reshape(x, 1, numel(x));
    [n, b] = hist(x2, 40);   % n = counts, b = bin centers
    % This is definitely not a probability density function
    x2 = sort(x2);
    % downsampling to speed up computations
    x2 = interp1(1:length(x2), x2, 1:1000:length(x2));
end
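Regarding the image case in the question: since the test expects a sorted sample rather than an image matrix, the pixel values probably need to be flattened, converted to double, and sorted first. The following is a minimal sketch under that assumption. The synthetic image, the `max_samples` cap, and the file name in the comment are hypothetical; `HartigansDipSignifTest` is the external function used above and is only called if it is found on the path.

```matlab
% Sketch: prepare the pixel values of a grayscale image for the dip test.
% The synthetic image is an assumption standing in for real data,
% e.g. img = imread('my_image.png') (hypothetical file name).
rng(0);                                        % reproducible example
img = uint8([60 + 10*randn(128, 64), 180 + 10*randn(128, 64)]);

max_samples = 1000;                            % hypothetical cap to keep the test fast
x = sort(double(img(:))).';                    % sorted row vector of gray levels
step = max(1, floor(numel(x) / max_samples));
xpdf = x(1:step:end);                          % thin the sorted sample

% HartigansDipSignifTest is not part of MATLAB; it must be on the path.
if exist('HartigansDipSignifTest', 'file') == 2
    nboot = 500;
    [dip, p_value] = HartigansDipSignifTest(xpdf, nboot);
    fprintf('dip = %.4f, p-value = %.3f\n', dip, p_value);
else
    disp('HartigansDipSignifTest not found on the path; skipping the test.');
end
```

Used this way, the test looks at the distribution of gray levels, which is the same information the image histogram summarizes.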
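Answer 1: (score: 1)
There are many different ways to do what you are asking. In the most literal sense, "bimodal" means there are two peaks. Usually, though, you want the two peaks to be separated by some reasonable distance, and you want each of them to contain a reasonable proportion of the total counts. Only you know what is "reasonable" for your situation, but the following approach may help.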
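Use cumsum on the histogram counts to find the central range of the data (the code below takes the values between the 20% and 80% points of the cumulative count). For a series of cut values in that range, split the data at the cut, compute the mean and standard deviation of each part, and take the difference of the two means divided by the sum of the two standard deviations.
You have to decide how large that quantity must be to count as "bimodal". Here is some code that demonstrates what I mean. It generates distributions of increasing bimodality: two Gaussians with an increasing delta between them (step size = the standard deviation). I compute the quantity described above and plot it against the cut value for each delta. Then I fit a parabola through that curve over a range corresponding to ±1 sigma of the entire distribution. As you can see in the output, as the distribution becomes more bimodal, two things happen: the quadratic coefficient a changes sign from positive to negative, and the fitted value at the mean (peak) increases.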
You can look at these quantities for some of your own distributions and decide where you want to put the cutoff.
% test for bimodal distribution
close all
for delta = 0:10:50
    a1 = randn(100,100) * 10 + 25;
    a2 = randn(100,100) * 10 + 25 + delta;
    a3 = [a1(:); a2(:)];
    [h, hb] = hist(a3, 0:100);
    cs = cumsum(h);
    % bin centers at the 20% and 80% points of the cumulative count
    llimi = find(cs < 0.2 * max(cs(:)));
    ulimi = find(cs > 0.8 * max(cs(:)));
    llim = hb(llimi(end));
    ulim = hb(ulimi(1));
    cuts = linspace(llim, ulim, 20);
    dmean = mean(a3);
    dstd = std(a3);
    % mean and standard deviation of the data below / above each cut
    m = zeros(numel(cuts), 2);
    s = zeros(numel(cuts), 2);
    for ci = 1:numel(cuts)
        d1 = a3(a3 < cuts(ci));
        d2 = a3(a3 >= cuts(ci));
        m(ci, 1) = mean(d1);
        m(ci, 2) = mean(d2);
        s(ci, 1) = std(d1);
        s(ci, 2) = std(d2);
    end
    % separation of the two parts relative to their spread
    q = (m(:, 2) - m(:, 1)) ./ sum(s, 2);
    figure;
    plot(cuts, q);
    title(sprintf('delta = %d', delta))
    % compute curvature of plot around mean:
    xlims = dmean + [-1 1] * dstd;
    % element-wise & is required here; && only accepts scalars
    indx = find(cuts < xlims(2) & cuts > xlims(1));
    pf = polyfit(cuts(indx), q(indx).', 2);
    % separate name so the matrix m is not overwritten
    peak = polyval(pf, dmean);
    fprintf(1, 'coefficients: a = %.2e, peak = %.2f\n', pf(1), peak);
end
Output values:
coefficients: a = 1.37e-03, peak = 1.32
coefficients: a = 1.01e-03, peak = 1.34
coefficients: a = 2.85e-04, peak = 1.45
coefficients: a = -5.78e-04, peak = 1.70
coefficients: a = -1.29e-03, peak = 2.08
coefficients: a = -1.58e-03, peak = 2.48
Example plot:
Histogram for delta = 40:
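To apply the same measure to a single data set and turn it into a yes/no answer, something along the following lines might work. This is a sketch, not the answer's own code: it takes the 20% and 80% points directly from the sorted data instead of from a histogram, and both cutoffs (`curvature_cutoff`, `peak_cutoff`) are hypothetical values that would need tuning on real data.

```matlab
% Sketch: split-based separation measure for one data vector.
rng(1);                                        % reproducible example
x = [randn(5000, 1) * 10 + 25; randn(5000, 1) * 10 + 65];  % example data, delta = 40

xs = sort(x(:));
n = numel(xs);
% cut points between the 20% and 80% points of the sorted data
cuts = linspace(xs(ceil(0.2 * n)), xs(floor(0.8 * n)), 20);

q = zeros(size(cuts));
for ci = 1:numel(cuts)
    lo = xs(xs <  cuts(ci));                   % data below the cut
    hi = xs(xs >= cuts(ci));                   % data at or above the cut
    q(ci) = (mean(hi) - mean(lo)) / (std(lo) + std(hi));
end

% parabola through q(cut) within one standard deviation of the mean
in_range = abs(cuts - mean(xs)) < std(xs);
pf = polyfit(cuts(in_range), q(in_range), 2);
peak = polyval(pf, mean(xs));

curvature_cutoff = 0;    % hypothetical: negative curvature suggests two modes
peak_cutoff = 1.7;       % hypothetical: roughly the fourth line of the output above
is_bimodal = pf(1) < curvature_cutoff && peak > peak_cutoff;
fprintf('a = %.2e, peak = %.2f, bimodal = %d\n', pf(1), peak, is_bimodal);
```

Requiring both conditions is a conservative choice; depending on the data, either the sign of the curvature or the peak value alone may be enough.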