Question

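This is my problem:

I want to know how many times values are repeated within an interval of a vector array. I know many people will tell me to use "hist", but I tried it and the result is not accurate enough. Let me show you my problem in a picture: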

[image]

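In the picture above, you can see the "data" in blue. I used 3 kinds of values: 1st the "Mode", 2nd the "Mean", and finally the "most repeated value in the histogram", which means I used something like [a, b] = hist(data) and then Mayor Value = b(a == max(a)), and, very importantly, without using a predefined range. But this picture doesn't represent the most repeated values, so let me show you another picture, which is a closer view of the data: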

[image]

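The "data" in blue varies between roughly (0-0.5)E-5, and that is the interval I need to obtain, but as you can see, the other three values are not close enough, and the "Mode" value is just "0". I hope you can help me with this problem. Thanks, by the way!

OK, to make it clearer, I added this new picture: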

[image]

What I'm looking for is a way to get the interval, like the 0.1 - 0.4 E-4 that I drew by hand in this example (purple), so the function would say:

[A, B] = magicfunction(data);

A = [0.1E-4 0.4E-4]; B = [123];

where B = 123 is the number of data points contained in that interval. As you can see, I only pass in the vector "data" and nothing else.

You can get the "data" at the following link: https://drive.google.com/file/d/0B4WGV21GqSL5Vk0tRUdLNk5XVnc/edit?usp=sharing

Answer 1

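Isn't it just a matter of taking the maximum within the range you want? You almost had it; you just didn't define the bins well. For example:

range=4750:5050;
[counts, val]=hist(data(range),unique(data(range)));
most_repeated_value_in_range=val(counts==max(counts));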
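
To see why using the unique values as bin centers helps, here is a minimal sketch on a tiny made-up vector (the numbers and the index variable idx are assumptions for illustration, not the asker's data). Each distinct value gets its own bin, so the tallest bin is exactly the most repeated value:

```matlab
% Hypothetical toy data, not the data from the question.
data = [3 1 2 2 5 2 1 4];
idx = 2:7;                                   % data(idx) is [1 2 2 5 2 1]

% One bin per distinct value, so counts are exact repetition counts.
[counts, val] = hist(data(idx), unique(data(idx)));
most_repeated = val(counts == max(counts));  % value(s) with the highest count
```

Note that this only works as intended when the selected slice contains at least two distinct values; if unique returns a single number, hist interprets it as a bin count rather than a bin center.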

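Edit:

After the clarification, what you want is a statistical bound on the width of the histogram around its peak (the most common value). Here is a solution: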

[c, v]=hist(data,linspace(min(data),max(data),num_of_bins));
range=find(c>1/exp(1)*max(c)); % can be also c>0.5*max(c) etc...
A=[v(range(1)) v(range(end))];
B=sum(c(range));

Let's test it on some fake data:

t=linspace(-50,50,1e3);
data=0.3*exp(-(t-30).^2)+0.2*exp(-(t-10).^2)+0.3*exp(-(t+10).^2)+0.01*randn(1,numel(t));

[c, v]=hist(data,linspace(min(data),max(data),numel(t)));
range=find(c>1/exp(1)*max(c));
A=[v(range(1)) v(range(end))];
B=sum(c(range));

plot(t,data,'b'); hold on
plot([min(t) max(t)],[A(1) A(1)] ,'--r');
plot([min(t) max(t)],[A(2) A(2)] ,'--r');
B

[image]

B =

   518

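Of course, you can change the definition of the "width" of the histogram. I took the points where the histogram falls to 1/e of its peak on each side; you could instead take the full width at half maximum (c>0.5*max(c)), or narrow it further depending on the type of data you have, etc.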
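
One thing to keep in mind with the threshold approach: if the histogram has several separate peaks above the threshold, range(1) and range(end) can span the gaps between them. A possible variant, sketched below under the assumption that only the region around the tallest bin is wanted, walks outwards from the peak and stops at the first bin that drops below the threshold. The toy data, num_of_bins, frac, lo, and hi are all made up for illustration:

```matlab
% Hypothetical data: many small values plus a few large spikes.
data = [0.1*ones(1,50), 0.2*ones(1,30), 0.3*ones(1,15), 5, 7, 9, 10, 10];
num_of_bins = 100;          % assumed bin count; tune for your data
frac = 0.5;                 % threshold as a fraction of the peak height

[c, v] = hist(data, linspace(min(data), max(data), num_of_bins));
[~, peak] = max(c);         % index of the tallest bin

% Walk outwards from the peak while neighbouring bins stay above threshold.
lo = peak;
while lo > 1 && c(lo-1) > frac*max(c), lo = lo - 1; end
hi = peak;
while hi < numel(c) && c(hi+1) > frac*max(c), hi = hi + 1; end

A = [v(lo) v(hi)];          % bin centers bounding the peak region
B = sum(c(lo:hi));          % number of samples falling in those bins
```

With this toy data the peak region covers the bins centered at 0.1 and 0.2, which hold 80 of the 100 samples. Lowering frac widens the interval, just as switching from 0.5 to 1/e does in the code above.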

Answer 2

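The following function was designed based on a few assumptions:

The "interval" of interest is close to 0.
Most of the samples are small.

The basic idea is to first filter out the samples that are too large, and then define the interval in terms of the sorted array of the remaining samples.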

function [A, B] = magicfunction(data)

% Assuming the outlier samples only exist on the positive side, some
% samples with big, positive values can be excluded in order to obtain a
% better estimate of "the interval". Here we exclude the samples that
% are greater than mean(data)+K1*std(data), where K1 is empirically
% selected as 1.0
K1 = 1.0;
filtered_data = data( data < mean(data)+K1*std(data)); 
sorted_data = sort(filtered_data);

% Define the interval in terms of the percentile in the
% sorted_data. Here the interval is empirically selected as [0, 0.75]
interval = [0 0.75];

% Map the percentile interval to the actual index in sorted_data.
% Note that interval_index(1) cannot be smaller than 1, and
% interval_index(2) cannot be greater than length(sorted_data)
interval_index = round( length(sorted_data)*interval );
interval_index(1) = max(1, interval_index(1));
interval_index(2) = min(length(sorted_data), interval_index(2));

% Assign output A in terms of the value in the sorted_data
A = sorted_data(interval_index)

% Assign output B
B = sum( data>A(1) & data<A(2) )

% Visualization
x = [1:length(data)];
figure;
subplot(211);
    plot(x, data, ...
         x, repmat(A(:)', length(data),1) ); grid on;
    legend('data', 'lower bound', 'upper bound');
    xlim([1 20000]);
subplot(212);
    plot(x, data, ...
         x, repmat(A(:)', length(data),1) ); grid on;
    legend('data', 'lower bound', 'upper bound');
    ylim([0, 3*10^-5]);
    xlim([1 20000]);

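Feeding the data provided in the question to the function produces the following figure: [image]

You may need to tune two variables in the function empirically to get the result you want: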

K1
interval
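
To get a feel for how K1 and interval interact, here is a minimal sketch of the same filter-then-sort steps on a small made-up vector, without the plotting (the data values and the names filtered and idx are assumptions for illustration). Unlike the function above, this sketch counts samples equal to the bounds, since the strict inequalities would otherwise skip every sample sitting exactly on A(1) or A(2):

```matlab
% Hypothetical data: eight small samples and two large outliers.
data = [1 2 2 3 3 3 4 4 50 60];
K1 = 1.0;                   % outlier cutoff, in standard deviations
interval = [0 0.75];        % fraction of the sorted, filtered samples to keep

% Drop large outliers, then sort what remains.
filtered = sort(data(data < mean(data) + K1*std(data)));

% Map the fractions to indices and clamp them to the valid range.
idx = round(numel(filtered) * interval);
idx = min(max(idx, 1), numel(filtered));

A = filtered(idx);                         % interval bounds
B = sum(data >= A(1) & data <= A(2));      % inclusive count (assumption, see above)
```

Here the two outliers are removed, A comes out as [1 3], and B is 6. Raising interval(2) towards 1 widens the interval, while a smaller K1 filters more aggressively before the bounds are picked.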

How to determine the most repeated values as an interval of a vector array in MATLAB
