Question

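I have a simple matrix with repeated values in some columns. I need to group the data by name and week, and sum the prices spent in each week. Here is an example: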

 name day  week  price
 John 12   12    200
 John 14   12    70
 John 25   13    150
 John 1    14    10
 Ann  13   12    100
 Ann  15   12    100
 Ann  20   13    50

The desired output is:

  name week sum
  John 12   270
  John 13   150
  John 14   10
  Ann  12   200
  Ann  13   50

Is there a good way to do this? I used loops, but I am not sure that is the best approach:

% assumes data is a numeric matrix (names stored as numeric IDs)
names = unique(data(:,1));   % unique names in the data
n = size(names, 1);          % number of unique names
m = size(data, 1);           % total number of rows
result = [];                 % output matrix (named result to avoid shadowing sum)
s = 1;                       % next free row in result
for i = 1:n
    temp = [];               % temporary matrix holding the rows of one name
    k = 1;
    for j = 1:m
        if names(i) == data(j,1)    % go through all the rows and copy those
            temp(k,:) = data(j,:);  % with the current name into temp
            k = k+1;
        end
    end
    count = 0;
    for l = 1:size(temp,1)          % go through the rows of one name (e.g. John)
        count = count + temp(l,4);  % add the price (4th column) to the running total
        % the data is sorted by name and date, so a week ends at the last
        % row or when the next row has a different week (3rd column)
        if l == size(temp,1) || temp(l,3) ~= temp(l+1,3)
            result(s, 1:3) = [names(i) temp(l,3) count];  % write one output row
            count = 0;
            s = s+1;
        end
    end
end

Answer 1

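Use accumarray. It groups and aggregates values in exactly this way. You can use the third output argument of unique(data(:,1)) to get numeric indices to pass as the subs argument of accumarray. See the accumarray documentation for details.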
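A minimal sketch of this approach, assuming the data is held as separate column variables (a cell array name plus numeric vectors week and price, which is an assumption about the layout and not something the question states). Each (name, week) pair is mapped to one group index with a second call to unique using the 'rows' option:

```matlab
% Example data from the question, stored as separate columns (assumed layout).
name  = {'John';'John';'John';'John';'Ann';'Ann';'Ann'};
week  = [12; 12; 13; 14; 12; 12; 13];
price = [200; 70; 150; 10; 100; 100; 50];

% Third output of unique: an integer index per row identifying the name.
[uNames, ~, nameIdx] = unique(name);

% Map each (name, week) pair to a single group index.
[pairs, ~, groupIdx] = unique([nameIdx week], 'rows');

% Sum the prices within each group (sum is the default for accumarray).
totals = accumarray(groupIdx, price);

% Assemble one output row per (name, week) pair as a cell array.
result = [uNames(pairs(:,1)), num2cell(pairs(:,2)), num2cell(totals)];
```

Because unique sorts its output by default, the rows of result come out ordered by name and then by week (Ann before John), not in the order of first appearance.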

Answer 2

The simplest way is probably to use the GRPSTATS function from the Statistics Toolbox. You first have to combine name and week to generate the groups:

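% week is numeric, so convert it to strings before concatenating with name
weekStr = cellstr(num2str(week(:)));
[name_week, priceSum] = grpstats(price, strcat(name(:), '@', weekStr), {'gname','sum'});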
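As a possible variation, grpstats also accepts several grouping variables in a cell array, which avoids building combined labels by hand. The sketch below assumes the Statistics Toolbox is installed and that name is a cell array of strings while week and price are numeric column vectors (an assumed layout):

```matlab
% Example data from the question, stored as separate columns (assumed layout).
name  = {'John';'John';'John';'John';'Ann';'Ann';'Ann'};
week  = [12; 12; 13; 14; 12; 12; 13];
price = [200; 70; 150; 10; 100; 100; 50];

% Group by each unique combination of name and week.
% 'gname' returns the group labels and 'sum' the per-group totals.
[groups, priceSum] = grpstats(price, {name, week}, {'gname', 'sum'});
```

Here groups should hold one column of labels per grouping variable, with priceSum giving the total for the matching row.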

Grouping and summing in Matlab
