Grouping and summarizing in MATLAB

Time: 2012-03-28 18:07:25

Tags: matlab for-loop count sum summary

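I have a simple matrix with repeated values in some columns. I need to group the data by name and week, and sum the prices spent in each week. Here is an example: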

 name day  week  price
 John 12   12    200
 John 14   12    70
 John 25   13    150
 John 1    14    10
 Ann  13   12    100
 Ann  15   12    100
 Ann  20   13    50

The desired output is:

  name week sum
  John 12   270
  John 13   150
  John 14   10
  Ann  12   200
  Ann  13   50

Is there a good way to do this? I used loops, but I am not sure that is the best approach:

names = unique(data(:,1));   % unique names in the data (assumed numeric IDs)
n = size(names, 1);          % number of unique names
m = size(data, 1);           % total number of rows
result = [];                 % output matrix: [name week total]
s = 1;                       % next free row in result (shared by all names)
for i = 1:n
    temp = [];               % temporary matrix for the rows of one name
    k = 1;
    for j = 1:m
        if names(i) == data(j,1)      % go through all rows and copy those with
            temp(k,:) = data(j,:);    % the current name into the temporary matrix
            k = k + 1;
        end
    end
    count = 0;
    for l = 1:size(temp,1)            % go through the rows of one name (e.g. John)
        count = count + temp(l,4);    % add the price (4th column) of this row
        % The data is sorted by name and date, so a week ends at the last row
        % or when the next row has a different week (3rd column).
        if l == size(temp,1) || temp(l,3) ~= temp(l+1,3)
            result(s, 1:3) = [names(i) temp(l,3) count];
            count = 0;                % start summing the next week
            s = s + 1;
        end
    end
end

2 answers:

Answer 0: (score: 3)

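Use accumarray. It groups and aggregates values in exactly this way. You can use the third output argument of unique(data(:,1)) to get numeric indices to pass as the subs argument of accumarray. See the accumarray documentation for details.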
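
As a rough sketch of that idea, the example below groups by name and week at once by calling unique with the 'rows' option on both columns, rather than on the name column alone. It assumes the names are stored as numeric IDs (1 for John, 2 for Ann), since a numeric matrix cannot hold text; the IDs and the variable names groups, idx, totals and result are illustrative and not taken from the question.

```matlab
% Sample data from the question: [name day week price].
% Assumption: names are encoded as numeric IDs (1 = John, 2 = Ann).
data = [1 12 12 200;
        1 14 12  70;
        1 25 13 150;
        1  1 14  10;
        2 13 12 100;
        2 15 12 100;
        2 20 13  50];

% One group per distinct (name, week) pair.
% idx(r) is the group number of row r of data.
[groups, ~, idx] = unique(data(:, [1 3]), 'rows');

% Sum the price column within each group.
totals = accumarray(idx, data(:, 4));

% Combine into [name week total], one row per group.
result = [groups totals];
disp(result)
```

Because unique sorts its output, the rows of result come out ordered by name ID and then by week, which may differ from the order in which names first appear in the data.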

Answer 1: (score: 1)

The simplest way is probably to use the GRPSTATS function from the Statistics Toolbox. You first have to combine name and week to create the groups:

[name_week priceSum] = grpstats(price, strcat(name(:), '@', week(:)), {'gname','sum'});
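
The line above presumably expects name and week to already be cell arrays of strings. A possible way to set that up for the sample data is sketched below, with two assumptions of this sketch: week starts out numeric and is converted to strings first, and the sum is passed as the function handle @sum instead of the string 'sum'. The Statistics Toolbox is required, and the variable names weekStr and group are illustrative.

```matlab
% Sample data from the question, stored as separate column variables.
name  = {'John'; 'John'; 'John'; 'John'; 'Ann'; 'Ann'; 'Ann'};
week  = [12; 12; 13; 14; 12; 12; 13];
price = [200; 70; 150; 10; 100; 100; 50];

% Assumption: week is numeric, so convert it to a cell array of strings
% before concatenating it with the names.
weekStr = strtrim(cellstr(num2str(week)));

% One label per row, e.g. 'John@12'; equal labels form one group.
group = strcat(name, '@', weekStr);

% 'gname' returns the group labels, @sum the total price per group.
[name_week, priceSum] = grpstats(price, group, {'gname', @sum});
```

The combined labels can be split back into name and week afterwards, for example with regexp(name_week, '@', 'split').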