Question

I have a dataset object ds whose "variables" (aka "columns") include A, B, and C.

For each "row" (aka "observation") r in ds, I can construct a "signature" based on the values of A, B, and C for r (e.g. as a cell array).

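In general, multiple "rows" of ds can have the same signature. As it stands, therefore, this signature cannot serve as a "key" for the table of observations stored in ds. I could, however, extend the signature with one additional (new) variable D that records how many times the row's particular combination of A, B, and C values has occurred in the dataset "so far" (i.e. as one iterates over the rows/observations of the dataset). To do this, though, I need to be able to iterate over the "rows" of ds, and I can't figure out how.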

Can someone tell me how to iterate over the rows (observations) of a dataset?

(I realize that such an iteration may be slow, but I can't think of any way around it if I am to build the new D "variable"/"column".)

Answer 1

You can treat the dataset like a standard MATLAB structure, as shown in the examples in the documentation. So for your case, you could do something along these lines:

count = containers.Map(); % keys are char by default
n = size(ds, 1);
D = cell(n, 1); % column cell array, one entry per observation
for i = 1:n % for each observation
    signature = [ds.A(i) ds.B(i) ds.C(i)]; % or however you wish to generate the signature
    key = mat2str(signature); % Map keys must be char, so serialize (assumes numeric A, B, C)

    occurrences = 1; % assuming we have not seen it before, we have one occurrence
    if isKey(count, key) % if we have seen it before
        occurrences = count(key) + 1; % add one to the old count
    end
    D{i} = [signature occurrences]; % make the complete signature
    count(key) = occurrences; % update our count map
end
ds.D = D; % add the new variable to the dataset
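
The running count on its own can also be sketched without a Map, by grouping identical rows with unique. This is only an illustration under assumptions: the sample values below are hypothetical stand-ins for ds.A, ds.B, and ds.C, and all three variables are assumed to be numeric columns.

```matlab
% Hypothetical sample values standing in for ds.A, ds.B, ds.C (numeric columns assumed)
A = [1; 1; 2; 1];
B = [5; 5; 6; 5];
C = [9; 9; 9; 9];

% g(i) is the index of the distinct (A, B, C) combination found in row i
[~, ~, g] = unique([A B C], 'rows');

% D(i) = number of times row i's combination has occurred up to and including row i
D = zeros(size(g));
for k = 1:max(g)
    idx = (g == k);       % rows sharing combination k, in their original order
    D(idx) = 1:nnz(idx);  % number them 1, 2, 3, ...
end
% D is now [1; 2; 1; 3]
```

The loop here runs once per distinct combination rather than once per observation, which may help when the dataset has many repeated rows.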

Answer 2

Assuming ds is a two-dimensional matrix with 3 columns, you can simply use a for loop:

for rowIndex = 1:size(ds,1) % number of rows
    A = ds(rowIndex,1);
    B = ds(rowIndex,2);
    C = ds(rowIndex,3);
    % ....
end

How do I iterate over the "rows" (aka "observations") of a dataset?
