I have datasets with the following structure:
Data_1 = [10,2,3,4,5,6,7,8,9];
groups_1 = [1,1,2,2,3,3,4,4,5];
Data_2 = [20,12,63,4,5,5,5,2];
groups_2 = [1,1,1,2,3,3,4,5];
boxplot(Data_1,groups_1);
figure; % open a second figure so the first plot is not overwritten
boxplot(Data_2,groups_2);
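I want to sort Data_1 by the mean of each group, and then plot Data_1 and Data_2 next to each other in a single box plot, with the groups ordered by the Data_1 group means. My approach is to use grpstats: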
Table_1 = array2table([Data_1' groups_1']);
Stats = grpstats(Table_1,'Var2');
sort_Stats = sortrows(Stats,'mean_Var1');
Now I am stuck on how to efficiently combine the two datasets and their groups based on the grpstats result.
Update: I need the following structure as the final goal:
Data_Corrected_1 = [3,4,5,6,10,2,7,8,9];
groups_Corrected_1 = [1,1,2,2,3,3,4,4,5];
boxplot(Data_Corrected_1,groups_Corrected_1);
% Now the box plot is arranged based on the mean
Answer 0 (score: 1)
Here is how to sort the data according to the group means:
Data_1 = [10,2,3,4,5,6,7,8,9];
groups_1 = [1,1,2,2,3,3,4,4,5];
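% ind lists the group numbers in order of ascending group mean
[~,ind] = sort(accumarray(groups_1.',Data_1,[],@mean));
% invert that permutation: ord1(k,1) is the rank of group k
ord1 = sortrows([(1:max(groups_1)).' ind],2);
% without repelem:
a = ord1(:,1);
b = histcounts(groups_1,'BinMethod','integers'); % number of elements per group
idx = [];                                        % make sure idx starts empty
idx(cumsum([1 b(b>0)])) = 1;                     % mark the first element of each group
c = a(cumsum(idx(1:find(idx,1,'last')-1)));      % rank of the group of each element
% rows are [rank, original group, value], sorted by rank
Data_Corrected_1 = sortrows([c groups_1.' Data_1.'],1);
% with repelem:
% Data_Corrected_1 = sortrows([repelem(ord1(:,1),histcounts(groups_1,'BinMethod','integers').')...
% groups_1.' Data_1.'],1);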
boxplot(Data_Corrected_1(:,3),Data_Corrected_1(:,1));
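The steps above amount to ranking the groups by their mean and relabelling every observation with the rank of its group. A more compact sketch of the same idea follows. It is not the code from the answer: the names means_1, order, rank_of, new_groups and perm are introduced here for illustration, and it assumes the group labels are positive integers. Unlike the matrix built above, it produces two row vectors in the form asked for in the question.

```matlab
Data_1   = [10,2,3,4,5,6,7,8,9];
groups_1 = [1,1,2,2,3,3,4,4,5];

% Mean of each group: element k holds the mean of group k
means_1 = accumarray(groups_1.', Data_1.', [], @mean);

% order(r) is the group with the r-th smallest mean
[~, order] = sort(means_1);

% Invert the permutation: rank_of(k) is the new label of original group k
rank_of = zeros(1, numel(order));
rank_of(order) = 1:numel(order);

% Relabel every observation, then sort the observations by the new label.
% sort keeps equal labels in their original order.
new_groups = rank_of(groups_1);
[groups_Corrected_1, perm] = sort(new_groups);
Data_Corrected_1 = Data_1(perm);

boxplot(Data_Corrected_1, groups_Corrected_1);
```

Indexing rank_of with groups_1 does not require the observations to be stored group by group, so this variant should also work when the group labels are interleaved.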
The same approach applies to Data_2.
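For the combined plot asked for in the question, one possible sketch is to rank the groups by the Data_1 means, apply that ranking to both data sets, and pass two grouping variables to boxplot. This goes beyond the answer's code and rests on assumptions: every label in groups_2 also occurs in groups_1, the labels are positive integers, and the names rank_of, all_data, all_rank, all_set and T are invented for this example.

```matlab
Data_1   = [10,2,3,4,5,6,7,8,9];
groups_1 = [1,1,2,2,3,3,4,4,5];
Data_2   = [20,12,63,4,5,5,5,2];
groups_2 = [1,1,1,2,3,3,4,5];

% Rank the groups by the mean of Data_1: rank_of(k) is the new label of group k
[~, order] = sort(accumarray(groups_1.', Data_1.', [], @mean));
rank_of = zeros(1, numel(order));
rank_of(order) = 1:numel(order);

% Stack both data sets and relabel their groups with the Data_1-based rank
all_data = [Data_1, Data_2];
all_rank = [rank_of(groups_1), rank_of(groups_2)];
all_set  = [ones(size(Data_1)), 2*ones(size(Data_2))];  % 1 = Data_1, 2 = Data_2

% Sort rows by rank, then by data set, so the order of appearance in the
% data already matches the intended box order
T = sortrows([all_rank.' all_set.' all_data.'], [1 2]);

% Two grouping variables: one box per (rank, data set) pair
boxplot(T(:,3), {T(:,1), T(:,2)});
```

Each rank then gets a pair of adjacent boxes, the first for Data_1 and the second for Data_2.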