Question

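Part of my data (a cell array of strings) looks like the following. I want to count the occurrences of specific strings (e.g. 'P0702', 'P0882', etc.) and display the total number of occurrences in the output form shown below: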

'1FA'   '2012'  'F' ''  ''  ''  ''  ''  'P0702' 'P0882' 
'1Fc'   '2012'  'r' ''  ''  ''  ''  ''  'P0702' ''  ''  ''  
'1FA'   '2012'  'f' ''  ''  ''  ''  ''  'P0702' 'P0882' ''  
'1FA'   '2012'  'y' ''  ''  ''  'P0702' ''  ''  ''  ''  ''  
'1FA'   '2012'  'g' ''  ''  ''  ''  ''  ''  ''  ''  ''  ''  
'1FA'   '2012'  'u' ''  'P0702' 'P0882' ''  ''  ''  ''  ''  
'1FA'   '2012'  'y' ''  'P0702' ''  ''  ''  ''  ''  ''  ''  
'1FA'   '2012'  'n' ''  'P0702' ''  ''  ''  ''  ''  ''  ''  
'1FA'   '2012'  'j' ''  ''  ''  ''  ''  ''  ''  ''  'P0702'                                
'1FA'   '2012'  'u' 'P0702' ''  ''  ''  ''  ''  ''  ''  ''  
'1FM'   '2013'  'x' ''  ''  ''  ''  ''  'P1921' ''  ''  ''
'1FM'   '2013'  'c' ''  'P1711' ''  ''  ''  ''  ''  ''  ''
'1FM'   '2013'  'c' ''  ''  ''  ''  ''  'P0702' 'P0882' ''
'1FM'   '2009'  'E' ''  ''  ''  ''  ''  ''  ''  'P0500'

Output:

        sum of counts above      
P0702   15
P0500    1
P1711    1

and so on.

I tried sum(strcmp(d,{'P0882'}),2); which tells me how many times 'P0882' occurs, but it is hard to apply it to every string in the data.

Answer 1

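You can do the following, which basically applies strcmp as you suggested, but inside a loop over the unique strings/data names to count, which are determined beforehand.

I modified the data you provided so that the dimensions are consistent. The code is commented and fairly easy to follow: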

C = {'1FA'   '2012'  'F' ''  ''  ''  ''  ''  'P0702' 'P0882' ;
'1Fc'   '2012'  'r' ''  ''  ''  ''  ''  'P0702' '';
'1FA'   '2012'  'f' ''  ''  ''  ''  ''  'P0702' 'P0882';
'1FA'   '2012'  'y' ''  ''  ''  'P0702' ''  ''  '';
'1FA'   '2012'  'g' ''  ''  ''  ''  ''  ''  '';
'1FA'   '2012'  'u' ''  'P0702' 'P0882' ''  ''  ''  ''  ;
'1FA'   '2012'  'y' ''  'P0702' ''  ''  ''  ''  '' ;
'1FA'   '2012'  'n' ''  'P0702' ''  ''  ''  ''  '' ;
'1FA'   '2012'  'j' ''  ''  ''  ''  ''  ''  'P0702' ;  
'1FA'   '2012'  'u' 'P0702' ''  ''  ''  ''  '' '' ;
'1FM'   '2013'  'x' ''  ''  ''  ''  ''  'P1921' '';
'1FM'   '2013'  'c' ''  'P1711' ''  ''  ''  ''  '';
'1FM'   '2013'  'c' ''  ''  ''  ''  ''  'P0702' 'P0882';
'1FM'   '2009'  'E' ''  ''  ''  ''  ''  '' 'P0500'}

%// Find the unique strings whose occurrences we want to count.
strings = unique(C(:,4:end));

%// Remove the empty string from the list.
strings = strings(~cellfun(@isempty,strings));

%// Initialize output cell array
Output = cell(numel(strings),2);

%// Count occurrences. You can combine the 2 lines into one using concatenation.
for k = 1:numel(strings)

    Output{k,1} = strings{k};    
    Output{k,2} = sum(sum(strcmp(C(:,4:end),strings{k})));

end

Let's put this in a nice table:

T = table(Output(:,2),'RowNames',Output(:,1),'VariableNames',{'TotalOccurences'})

Output:

T = 

             TotalOccurences
             _______________

    P0500    [ 1]           
    P0702    [10]           
    P0882    [ 4]           
    P1711    [ 1]           
    P1921    [ 1]

If you don't have access to the table function, you can create a cell array with headers and change the loop accordingly:

%// Initialize output cell array, with one extra row for the header
Output = cell(numel(strings)+1,2);

%// Count occurrences; rows are shifted down by one to leave room for the header
for k = 1:numel(strings)

    Output{k+1,1} = strings{k};    
    Output{k+1,2} = sum(sum(strcmp(C(:,4:end),strings{k})));

end
%T = table(Output(:,2),'RowNames',Output(:,1),'VariableNames',{'TotalOccurences'})

Output(1,:) = {'Data' 'Occurence'}

Output:

Output = 

    'Data'     'Occurence'
    'P0500'    [        1]
    'P0702'    [       10]
    'P0882'    [        4]
    'P1711'    [        1]
    'P1921'    [        1]
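
The output requested in the question lists the most frequent string first. As a minimal sketch, assuming Output has the header-plus-rows layout displayed above (the literal below is retyped from that display so the snippet runs on its own), the data rows could be reordered by count like this:

```matlab
% Output as displayed above: a header row followed by {string, count} rows.
Output = {'Data'  'Occurence';
          'P0500' 1;
          'P0702' 10;
          'P0882' 4;
          'P1711' 1;
          'P1921' 1};

% Pull the counts out of the data rows (row 1 is the header).
counts = cell2mat(Output(2:end,2));

% Sort descending and keep the permutation of the data rows.
[~, order] = sort(counts, 'descend');

% Reassemble: header first, then the data rows in sorted order.
Sorted = [Output(1,:); Output(order+1,:)]
```

The +1 offset accounts for the header row, which stays in place while only the data rows move.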

Answer 2

If you have the Statistics Toolbox, you can simply use tabulate:

%// get only the relevant part (data is the cell array from the question)
X = data(:,4:end);

%// tabulate
tabulate(X(:))

It already provides nicely formatted output:

  Value    Count   Percent
  P0702       10     58.82%
  P1711        1      5.88%
  P0882        4     23.53%
  P1921        1      5.88%
  P0500        1      5.88%

Alternatively, standard functions can be used instead of the toolbox.
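
A minimal sketch of a toolbox-free count, assuming unique and accumarray as the standard functions (this is one possible approach, not necessarily the code the answer originally showed). The small cell array X is a hypothetical stand-in for data(:,4:end):

```matlab
% Hypothetical stand-in for the relevant part of the data (columns 4:end).
X = {''      'P0702' 'P0882';
     'P0702' ''      '';
     'P1711' ''      'P0702'};

% Flatten to a column and drop the empty strings.
x = X(:);
x = x(~cellfun(@isempty, x));

% idx maps every entry to its position in the sorted list of unique strings.
[names, ~, idx] = unique(x);

% Count how often each index occurs.
counts = accumarray(idx(:), 1);

% Pair each string with its count.
result = [names(:) num2cell(counts)]
```

Dropping the empty strings before calling unique keeps them out of the result, so no separate filtering step is needed afterwards.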

Answer 3

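You can count the occurrences of all strings without a loop. Let C be your cell array.

%// v holds, for every cell of C, the index of its string in uniqueStrings
[uniqueStrings, ~, v] = unique(C);
%// count how many times each index occurs
counts = histc(v, 1:max(v));
%// pair each unique string with its count
result = [uniqueStrings(:) num2cell(counts(:))];

For your example, this gives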

result = 
    ''         [81]
    '1FA'      [ 9]
    '1FM'      [ 4]
    '1Fc'      [ 1]
    '2009'     [ 1]
    '2012'     [10]
    '2013'     [ 3]
    'E'        [ 1]
    'F'        [ 1]
    'P0500'    [ 1]
    'P0702'    [10]
    'P0882'    [ 4]
    'P1711'    [ 1]
    'P1921'    [ 1]
    'c'        [ 2]
    'f'        [ 1]
    'g'        [ 1]
    'j'        [ 1]
    'n'        [ 1]
    'r'        [ 1]
    'u'        [ 2]
    'x'        [ 1]
    'y'        [ 2]
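
Because this result covers every cell of C, it also counts the empty string and the entries of the first three columns. As a rough sketch, assuming only certain strings are of interest, the rows can be selected with ismember. The list wanted is a hypothetical example, and result below is a retyped excerpt of the table above so the snippet runs on its own:

```matlab
% A few rows of the result shown above, retyped for a self-contained example.
result = {''      81;
          '1FA'   9;
          'P0500' 1;
          'P0702' 10;
          'P0882' 4};

% Hypothetical list of the strings we actually want to report.
wanted = {'P0702', 'P0882'};

% Logical mask of the rows whose string appears in the wanted list.
keep = ismember(result(:,1), wanted);

% Keep only those rows.
filtered = result(keep,:)
```

Another option is to pass only C(:,4:end) to unique in the first place, which leaves just the empty string to discard.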
