I have a scenario with a Label matrix of size N x 1. Example entries of the label matrix are shown below:
Label = [1; 3; 5; ....... 6]
I want to get
LabelIndicatorMatrix = [1; 1; 0;.....1]
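where 1 marks a record that was selected during sampling and 0 marks a record that was not. The output matrix must satisfy the following condition: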
Sum(LabelIndicatorMatrix) = m1+m2...m6
Answer 0 (score: 2)
One possible solution:
Label = randi([1 6], [100 1]); %# random Nx1 vector of labels
m = [2 3 1 0 1 2]; %# number of records to sample from each category
LabelIndicatorMatrix = false(size(Label)); %# marks selected records
uniqL = unique(Label); %# unique labels: 1,2,3,4,5,6
for i=1:numel(uniqL)
    k = uniqL(i); %# current label value (m is indexed by label, not by loop position)
    idx = find(Label == k); %# indices where label==k
    ord = randperm(length(idx)); %# random permutation
    ord = ord(1:min(m(k),end)); %# pick first m_k, or fewer if not enough records exist
    LabelIndicatorMatrix( idx(ord) ) = true; %# mark them as selected
end
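To make sure the requirement is met, we check (the equality holds when every label has at least as many records as m requests for it):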
>> sum(LabelIndicatorMatrix) == sum(m)
ans =
1
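The check above compares only the totals. A per-label check can be sketched as follows, assuming the labels are the integers 1 to numel(m) as in the example data; the names `available` and `selected` and the fixed seed are illustrative and not part of the original answer.

```matlab
%# Sketch: verify the sample label by label instead of only in total.
rng(0);                                    %# fixed seed, only for repeatability
Label = randi([1 6], [100 1]);             %# example labels, as in the answer
m = [2 3 1 0 1 2];                         %# requested sample size per label

%# Draw the sample with a loop in the style of the answer.
LabelIndicatorMatrix = false(size(Label));
for k = 1:numel(m)
    idx = find(Label == k);                %# records carrying label k
    ord = randperm(numel(idx));            %# random order of those records
    ord = ord(1:min(m(k), end));           %# keep at most m(k) of them
    LabelIndicatorMatrix(idx(ord)) = true; %# mark them as selected
end

%# Count available and selected records for each label.
available = accumarray(Label, 1, [numel(m) 1]);
selected  = accumarray(Label(LabelIndicatorMatrix), 1, [numel(m) 1]);

%# Each label should contribute min(m(k), available(k)) selected records.
assert(isequal(selected, min(m(:), available)))
```

If a label has fewer records than requested, this check still passes, whereas the comparison against sum(m) would not.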
Here is my attempt at a vectorized solution:
Label = randi([1 6], [100 1]); %# random Nx1 vector of labels
m = [2 3 1 0 1 2]; %# number of records to sample from each category
%# some helper functions
firstN = @(V,n) V(1:min(n,end)); %# first n elements from vector
pickN = @(V,n) firstN(V(randperm(length(V))), n); %# pick n elements from vector
%# randomly sample labels, and get indices
uniqL = unique(Label); %# unique labels present in the data
idx = bsxfun(@eq, Label, uniqL'); %# idx(:,k) indicates where label==uniqL(k)
[r c] = find(idx); %# row/column indices
idx = arrayfun(@(k) pickN(r(c==k),m(uniqL(k))), 1:size(idx,2), ...
'UniformOutput',false); %# sample m(uniqL(k)) from labels==uniqL(k)
%# mark selected records
LabelIndicatorMatrix = false(size(Label));
LabelIndicatorMatrix( vertcat(idx{:}) ) = true;
%# check results are correct
assert( sum(LabelIndicatorMatrix)==sum(m) )
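The solution above still calls randperm once per label through arrayfun. As a different technique, not the one used in the answer, the whole sample can be drawn with a single sort on random keys. This is a minimal sketch assuming the labels are the integers 1 to numel(m); `counts`, `firstPos`, and `rankInLabel` are hypothetical helper names.

```matlab
%# Sketch: stratified sampling with one sort instead of a per-label call.
Label = randi([1 6], [100 1]);               %# example labels, as in the answer
m = [2 3 1 0 1 2];                           %# requested sample size per label
N = numel(Label);
mCol = m(:);                                 %# column form, indexed by label

%# Sort by label, breaking ties with a random key: shuffles within each label.
[~, order] = sortrows([Label, rand(N, 1)]);
sortedLabel = Label(order);

%# Position of every record inside its own label block (1, 2, 3, ...).
counts = accumarray(sortedLabel, 1, [numel(m) 1]);
firstPos = cumsum([1; counts(1:end-1)]);     %# start of each label block
rankInLabel = (1:N)' - firstPos(sortedLabel) + 1;

%# Keep the first m(k) records of each shuffled label block.
keep = rankInLabel <= mCol(sortedLabel);
LabelIndicatorMatrix = false(N, 1);
LabelIndicatorMatrix(order(keep)) = true;
```

Because the tie-breaking keys are random, the first m(k) records of each block form a random subset of that label.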
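Answer 1 (score: 1)
You can start from this small code example, which draws a random sample of positions in the label vector and keeps the positions that were selected at least once: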
Label = [1; 3; 5; ....... 6];
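N = numel(Label); %# number of records
index = randi(N,m1,1); %# m1 random indices, drawn with replacement
index = unique(index); %# drop repeats, so fewer than m1 indices may remain
LabelIndicatorMatrix = zeros(N,1);
LabelIndicatorMatrix(index)=1; %# mark selected records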
That said, I am not sure I understand the final condition on LabelIndicatorMatrix.
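Since randi draws with replacement, the snippet above can end up marking fewer than m1 records. If exactly m1 distinct records are wanted, a sketch using the two-argument form of randperm may help. The label values and the value of m1 below are made up for illustration, and the sample is drawn from all records regardless of label.

```matlab
%# Sketch: mark exactly m1 distinct records, ignoring the label values.
Label = [1; 3; 5; 2; 6; 6; 4; 1];  %# hypothetical example labels
N  = numel(Label);                 %# number of records
m1 = 3;                            %# hypothetical sample size, must be <= N

index = randperm(N, m1);           %# m1 distinct indices, no repeats
LabelIndicatorMatrix = zeros(N, 1);
LabelIndicatorMatrix(index) = 1;   %# mark selected records
```

With this form, sum(LabelIndicatorMatrix) equals m1 on every run.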