Question

我有以下两个表，Data和Members：

Data = table(sort(repmat(datenum(2001,1:5,1).',4,1)),repmat(('A':'D').',5,1),repmat((201:204).',5,1),'VariableNames',{'Date','ID','Price'});

Data = 

       Date       ID    Price
    __________    __    _____

    7.3085e+05    A     201  
    7.3085e+05    B     202  
    7.3085e+05    C     203  
    7.3085e+05    D     204  
    7.3088e+05    A     201  
    7.3088e+05    B     202  
    7.3088e+05    C     203  
    7.3088e+05    D     204  
    7.3091e+05    A     201  
    7.3091e+05    B     202  
    7.3091e+05    C     203  
    7.3091e+05    D     204  
    7.3094e+05    A     201  
    7.3094e+05    B     202  
    7.3094e+05    C     203  
    7.3094e+05    D     204  
    7.3097e+05    A     201  
    7.3097e+05    B     202  
    7.3097e+05    C     203  
    7.3097e+05    D     204

Members = table(datenum(2001,1:5,1).',{cell2table({'B','C'});table({'A'});cell2table({'B','D'});cell2table({'A','C'});cell2table({'A','C'})},'VariableNames',{'Date','MemberID'});


Members = 

       Date        MemberID  
    __________    ___________

    7.3085e+05    [1x2 table]
    7.3088e+05    [1x1 table]
    7.3091e+05    [1x2 table]
    7.3094e+05    [1x2 table]
    7.3097e+05    [1x2 table]

现在，我想将函数（mean）应用于每个“日期”的Price变量，并仅针对每个ID中的MemberId日期。因此，对于第一个日期，7.3085e + 05，我想要ID 2和3的平均值 - 即（202 + 203）/ 2 - 等。

我能够通过一个非常慢的for循环来实现这一点（我的实际表格要大得多）。我认为这应该可以使用varfun或类似的，但我不能让它工作。有什么想法吗？

更新

将ID更改为文本，因为这是我的真实数据。

Answer 1

方法1

%// Find offsets to be used for linear indexing into elements for each group
off1 = [0 ; find(diff(Data.ID)<0)]

%// Store prices and member IDs as numeric arrays
prc = Data.Price
mem_id = Members.MemberId

%// Convert each MemberID table into arrays; add the corresponding
%// linear indexing offsets and finally index into prc and get the mean values
out = arrayfun(@(n) mean(prc(off1(n)+ char(table2array(mem_id{n}))-'A'+1 )),...
                                                            1:numel(mem_id))

方法2

%// Find offsets to be used for linear indexing into elements for each group
off1 = [0 ; find(diff(Data.ID)<0)]

%// Store prices and member IDs as numeric arrays
prc = Data.Price
mem_id = Members.MemberId

%// Storage for output
out = zeros(1,numel(mem_id))

%// Get the widths(lengths) of each table in MemberID and the unique lengths
lens = arrayfun(@(n) width(mem_id{n}),1:numel(mem_id))
unqlens = unique(lens,'stable')

%// Now, that you have groups of MemberID tables with uniform
%// widths/lengths, you can use table2array on all those tables with single
%// call within each group and thus could be beneficial in performance
for iter = 1:numel(unqlens)
    mask = lens==unqlens(iter); %// Mask for each group
    mem_id_iter = mem_id(mask); %// Member IDs
    grp_ids = reshape(char(table2array(vertcat(mem_id_iter{:}))) - 'A' + 1,[],...
                                                unqlens(iter)); %// group IDs
    lin_idx = bsxfun(@plus,off1(mask),grp_ids); %// linear indices
    out(mask) = mean(prc(lin_idx),2); %// index into prc and get mean values
end

Answer 2

使用ismember我自己找到了以下解决方案：

% Defining the linear indexes according to Members for each date
AAA = arrayfun(@(n) ismember(table2array(Data(Data.Date==table2array(Members(n,'Date')),2)),table2array(Members.MemberID{n})),1:size(Members,1),'UniformOutput',false);

% Defining cell of Data for each date
BBB = arrayfun(@(n) Data(Data.Date==table2array(Members(n,'Date')),:),1:size(Members,1),'UniformOutput',false)

% Applying the linear index AAA on each data (BBB) and taking the average
CCC = arrayfun(@(n,x) mean(table2array(x{:}(n{:},3))),AAA,BBB)

CCC =

  202.5000  201.0000  203.0000  202.0000  202.0000

在Matlab中以几个变量为条件将函数应用于表

2 个答案:

方法1

方法2