Question

Suppose A is a 200-element cell array containing 4 distinct strings (each repeated 50 times). B is a 200-element vector of integers.

I am using [cellNos cellStartInd enumCells] = unique(A) to find out which of the unique strings each item in A equals (enumCells is an array of the integers 1-4, effectively enumerating the strings).

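I want to use this information to build a 4x50 matrix of values from B, so that each column holds the values for a particular unique string. In other words, I want to reshape B into a matrix whose columns are arranged according to the unique strings in A.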

Answer 1

Assuming you already know how many repetitions there will be, and that all strings repeat with the same frequency, you can do the following:

%# sort to find where the entries occur (remember: sort does stable sorting)
[~,sortIdx] = sort(enumCells);

%# preassign the output to 50-by-4 for easy linear indexing
newB = zeros(50,4);

%# fill in values from B: first the 50 ones, then the 50 2's etc
newB(:) = B(sortIdx);

%# transpose to get a 4-by-50 array
newB = newB';

Or, more compactly (thanks @Rich C):

[~,sortIdx] = sort(enumCells);
newB = reshape(B(sortIdx),50,4)';
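
To see what the sort-and-reshape step does, here is a minimal sketch on a scaled-down case. The data (2 unique strings, 3 repeats each) and the helper names nUnique and nRepeats are made up for illustration; they stand in for the 4 and 50 of the question.

```matlab
% Hypothetical small stand-in for the 200-element case
A = {'x' 'y' 'x' 'y' 'y' 'x'};
B = [10 20 30 40 50 60];

[~, ~, enumCells] = unique(A);     % 1 where A is 'x', 2 where A is 'y'
[~, sortIdx] = sort(enumCells);    % stable sort: ties keep their original order

nUnique  = max(enumCells);         % number of distinct strings (2 here)
nRepeats = numel(B) / nUnique;     % repeats per string (3 here), assumed equal for all

% Fill column by column, then transpose so each row belongs to one string
newB = reshape(B(sortIdx), nRepeats, nUnique)';
% newB is [10 30 60; 20 40 50]
```

Computing the sizes from the data rather than hard-coding 50 and 4 is just a convenience; it still relies on every string occurring the same number of times.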

Answer 2

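For the general case where you have N distinct strings and each one occurs a different number of times M_i, the corresponding sets of values in B will have different lengths, so you won't be able to concatenate them into a numeric array. Instead, you will need to store the sets in an N-element cell array, which you can do using the UNIQUE and ACCUMARRAY functions: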

>> A = {'a' 'b' 'b' 'c' 'a' 'a' 'a' 'c' 'd' 'b'};  %# Sample array A
>> B = 1:10;                                       %# Sample array B
>> [uniqueStrings,~,index] = unique(A);
>> associatedValues = accumarray(index(:),B,[],@(x) {x})

associatedValues = 

    [4x1 double]    %# The values 1, 5, 6, and 7
    [3x1 double]    %# The values 2, 3, and 10
    [2x1 double]    %# The values 4 and 8
    [         9]    %# The value 9

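In the specific case where each string occurs the same number of times, the code above still works, and you can optionally convert the cell array output into the desired numeric array like so: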

associatedValues = [associatedValues{:}];

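Note: Since ACCUMARRAY is not guaranteed to preserve the relative order of the items it accumulates, the order of the items within each cell of associatedValues may not match their relative order in the vector B. One way to make sure the original relative order from B is maintained is to modify the call to ACCUMARRAY as follows:

associatedValues = accumarray(index(:),1:numel(B),[],@(x) {B(sort(x))});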

Or you can sort the inputs to ACCUMARRAY to get the same effect:

[index,sortIndex] = sort(index);
associatedValues = accumarray(index(:),B(sortIndex),[],@(x) {x});
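
Putting the pieces together for the equal-count case, here is a rough sketch that sorts the inputs first and then converts the cell array into a matrix with one row per unique string. The sample data and the names Bcol, groups and M are invented for this example.

```matlab
% Hypothetical sample data: 2 unique strings, 3 repeats each
A = {'x' 'y' 'x' 'y' 'y' 'x'};
B = [10 20 30 40 50 60];

[~, ~, index] = unique(A);
[index, sortIndex] = sort(index(:));   % stable sort keeps B's order within each group

Bcol = B(:);                           % column vector, same length as index
groups = accumarray(index, Bcol(sortIndex), [], @(x) {x});

% Each cell holds a 3x1 column; concatenate side by side, then transpose
M = [groups{:}]';
% M is [10 30 60; 20 40 50]
```

The final concatenation only works when every cell has the same length; with unequal counts, keep the result as a cell array.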

Answer 3

If I understand your question correctly, this can be done with the find function. http://www.mathworks.com/help/techdoc/ref/find.html

To create the matrix, just write:

M(:,1) = B(find(enumCells==1));
M(:,2) = B(find(enumCells==2));
M(:,3) = B(find(enumCells==3));
M(:,4) = B(find(enumCells==4));

There is probably a more elegant way, but this should work.

Edit: You could also try doing this with sort. The sort function can return the sorting permutation as a second output. Try:

[s perm] = sort(enumCells);
M = reshape(B(perm),50,4);

Answer 4

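This approach works if each string has the same number of entries; if the counts differ, see @gnovice's solution.

NumStrings = numel(cellNos);
M = zeros(numel(B)/NumStrings,NumStrings);
for i = 1:NumStrings
    %# select the entries of B whose string in A matches the i-th unique string
    M(:,i) = B(strcmp(A,cellNos{i}));
end

Also, if you know the unique strings (i.e. cellNos) ahead of time, this lets you skip the relatively expensive call to unique.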
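
Since this loop (like the reshape-based answers) depends on every string occurring equally often, it may be worth checking that before building the matrix. A minimal sketch, using made-up data and the hypothetical names counts and isBalanced:

```matlab
% Hypothetical sample data: 2 unique strings, 3 repeats each
A = {'x' 'y' 'x' 'y' 'y' 'x'};

[~, ~, index] = unique(A);
counts = accumarray(index(:), 1);        % number of occurrences of each unique string
isBalanced = all(counts == counts(1));   % true only if all counts are equal
```

If isBalanced is false, the values cannot be packed into a rectangular matrix and the cell array approach is the safer choice.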

Reshaping a vector in MATLAB using unique()
