Sorting a cell array based on a column that contains NaN

Time: 2014-03-31 19:45:36

Tags: matlab sorting

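My data is a 1x7 cell array named PM25. Each of its cells holds another cell array of size 365x5xN, where N varies. Below is part of PM25{1,1}. (The data can be found here: https://www.dropbox.com/sh/li3hh1nvt11vok5/4YGfwStQlo. The variable in question is PM25.)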

'42.493056'    '-92.343889'    '19-013-0008'    [733043]    [ NaN]
'42.493056'    '-92.343889'    '19-013-0008'    [733044]    [ NaN]
'42.493056'    '-92.343889'    '19-013-0008'    [733045]    '3.6' 
'42.493056'    '-92.343889'    '19-013-0008'    [733046]    [ NaN]
'42.493056'    '-92.343889'    '19-013-0008'    [733047]    [ NaN]
'42.493056'    '-92.343889'    '19-013-0008'    [733048]    '10'  
'42.493056'    '-92.343889'    '19-013-0008'    [733049]    [ NaN]
'42.493056'    '-92.343889'    '19-013-0008'    [733050]    [ NaN]
'42.493056'    '-92.343889'    '19-013-0008'    [733051]    '5.8' 
'42.493056'    '-92.343889'    '19-013-0008'    [733052]    [ NaN]
'42.493056'    '-92.343889'    '19-013-0008'    [733053]    [ NaN]
'42.493056'    '-92.343889'    '19-013-0008'    [733054]    '7.7' 

I am trying to sort each whole cell array by its last column, the concentration. This is what I have been doing:

% Sort each site based on the concentration values - Descending order with NaN's at the bottom
for i = 1:length(names_PM25_O3) % States
    for j = 1:length(PM25{i}(1,1,:)) % Number of sites
        [~,ix] = sort(str2double(PM25{i}(:,5,j))); % Sorted indices
        nanmask = isnan(str2double(PM25{i}(ix,5,j))); % Get mask (0 or 1) of nan-rows to be ignored
        ix = flipdim(ix(~nanmask),1); % Get non-nan indices in reverse order
        PM25_sorted{i} = PM25{i}(ix,:,:); % Sort
    end
end

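The problem is that this code only sorts the last N in each of the 7 cells of PM25. Every other N is reordered according to the last N and ends up with fewer than 365 values, probably because the NaNs were removed in the last N.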
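The symptom can be reproduced on a small toy array. The sketch below is only an illustration under stated assumptions: the values are hypothetical, a single column stands in for column 5, and flipud is used in place of flipdim(...,1).

```matlab
% Toy stand-in for one PM25{i}: 4 rows x 1 column x 2 sites (hypothetical values)
C = cell(4,1,2);
C(:,1,1) = {'3'; NaN; '9'; '5'};   % site 1
C(:,1,2) = {'8'; '2'; NaN; '6'};   % site 2

for j = 1:size(C,3)
    [~,ix] = sort(str2double(C(:,1,j)));       % ascending, NaN last
    nanmask = isnan(str2double(C(ix,1,j)));    % rows whose value is NaN
    ix = flipud(ix(~nanmask));                 % descending, NaN rows dropped
    C_sorted = C(ix,:,:);                      % reindexes ALL sites and overwrites the previous pass
end

size(C_sorted)       % 3 rows remain instead of 4
C_sorted(:,1,2)      % last site: sorted in descending order
C_sorted(:,1,1)      % first site: merely follows the last site's row order
```

Because the assignment inside the loop indexes with (ix,:,:) and replaces the whole result, only the ix computed on the final pass survives, and it is applied to every site.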

For example, here is part of N = 1 (PM25{1,1}(:,:,1)):

'42.493056'    '-92.343889'    '19-013-0008'    [733396]    '63'  
'42.493056'    '-92.343889'    '19-013-0008'    [733393]    '37.5'
'42.493056'    '-92.343889'    '19-013-0008'    [733108]    '28.7'
'42.493056'    '-92.343889'    '19-013-0008'    [733207]    '23.1'
'42.493056'    '-92.343889'    '19-013-0008'    [733366]    '27.7'
'42.493056'    '-92.343889'    '19-013-0008'    [733255]    '19.2'
'42.493056'    '-92.343889'    '19-013-0008'    [733063]    '24.7'
'42.493056'    '-92.343889'    '19-013-0008'    [733225]    '11.7'
'42.493056'    '-92.343889'    '19-013-0008'    [733066]    '19.9'
'42.493056'    '-92.343889'    '19-013-0008'    [733250]    [ NaN]
'42.493056'    '-92.343889'    '19-013-0008'    [733387]    '26.5'
'42.493056'    '-92.343889'    '19-013-0008'    [733153]    '15.6'
'42.493056'    '-92.343889'    '19-013-0008'    [733384]    '12.9'

And this is part of the last N, which is N = 21 in PM25{1} (PM25{1}(:,:,21)):

'42.695391'    '-93.655976'    '19-197-0004'    [733396]    '48'  
'42.695391'    '-93.655976'    '19-197-0004'    [733393]    '36.4'
'42.695391'    '-93.655976'    '19-197-0004'    [733108]    '33.3'
'42.695391'    '-93.655976'    '19-197-0004'    [733207]    '25.4'
'42.695391'    '-93.655976'    '19-197-0004'    [733366]    '24.3'
'42.695391'    '-93.655976'    '19-197-0004'    [733255]    '22.4'
'42.695391'    '-93.655976'    '19-197-0004'    [733063]    '21'  
'42.695391'    '-93.655976'    '19-197-0004'    [733225]    '20'  
'42.695391'    '-93.655976'    '19-197-0004'    [733066]    '19.8'
'42.695391'    '-93.655976'    '19-197-0004'    [733250]    '19.6'
'42.695391'    '-93.655976'    '19-197-0004'    [733387]    '19.5'
'42.695391'    '-93.655976'    '19-197-0004'    [733153]    '19.2'
'42.695391'    '-93.655976'    '19-197-0004'    [733384]    '18.8'

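As you can see, N = 21 is sorted in descending order and all of its NaNs are gone. N = 1, however, simply follows the row order of N = 21 (look at column 4, the dates: they are in the same order), so it is not sorted in descending order.

How can I get each N of each cell sorted individually? I will probably have to keep the NaN rows, since otherwise each N would have a different length. At the moment it looks like they are being removed from the N that gets sorted.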

1 Answer:

Answer 0: (score: 1)

Function -

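function sorted_cell_array = sortcell_col5(org_cell_array)
% Sort the rows of a 2-D cell array by column 5 in descending order,
% keeping rows whose column-5 entry is numeric (e.g. NaN) at the bottom.

col5 = org_cell_array(:,5);
isnum = cellfun(@isnumeric,col5);               % true where the entry is numeric (NaN) rather than text
t2 = NaN(size(org_cell_array,1),1);             % numeric copy of column 5, NaN by default
t2(~isnum) = str2double(col5(~isnum));          % convert the text entries (str2double avoids the eval in str2num)
[~,y1] = sort(t2);                              % ascending order; sort places NaN last
c1 = nnz(~isnan(t2));                           % number of valid (non-NaN) values
if ~c1
    sorted_cell_array = org_cell_array(y1,:);   % every value is NaN, nothing to reverse
else
    ind1 = [ flipud(y1(1:c1)) ; y1(c1+1:end) ]; % reverse the valid part, leave the NaN rows last
    sorted_cell_array = org_cell_array(ind1,:);
end

return;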
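
The index construction in the function relies on sort placing NaN values last when sorting in ascending order, so only the leading, non-NaN part of the index vector has to be reversed. A minimal sketch of that step on hypothetical concentration values (the variable names mirror those in the function):

```matlab
t2 = [NaN; 3.6; NaN; 10; 5.8];              % hypothetical numeric copy of column 5
[~,y1] = sort(t2);                          % ascending; the two NaN entries end up last
c1 = nnz(~isnan(t2));                       % 3 valid values
ind1 = [ flipud(y1(1:c1)) ; y1(c1+1:end) ]; % valid rows descending, NaN rows after them
t2(ind1)                                    % 10, 5.8, 3.6 followed by the NaN entries
```

Since ind1 is a permutation of all row indices, indexing the cell array with it keeps every row, which is what preserves the 365-row length of each site.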

Main script -

load data_2007.mat % Load your data MAT-file

PM25_sorted = PM25;
M1 = size(PM25,2);
for k1 = 1:M1
    [sz1,sz2,N] = size(PM25{1,k1});
    for k2 = 1:N
        PM25_sorted{1,k1}(:,:,k2) = sortcell_col5(PM25{1,k1}(:,:,k2));
    end
end
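
For comparison, the loop from the question can be adapted along the same lines without a separate function: start from a copy of the input, keep the NaN rows in the index vector, and assign only the slice for the current site. The sketch below is not the code from the question or the answer; it runs on a hypothetical toy array with the concentration in column 2 rather than column 5.

```matlab
% Hypothetical stand-in for one PM25{i}: 4 rows x 2 columns x 2 sites
C = cell(4,2,2);
C(:,:,1) = {'a','3'; 'b',NaN; 'c','9'; 'd','5'};
C(:,:,2) = {'e','8'; 'f','2'; 'g',NaN; 'h','6'};
col = 2;                                   % column holding the concentration (5 in the real data)

C_sorted = C;                              % same size as the input, filled slice by slice
for j = 1:size(C,3)
    v = str2double(C(:,col,j));            % NaN for cells that do not contain text
    [~,ix] = sort(v);                      % ascending, NaN last
    n = nnz(~isnan(v));                    % number of valid values at this site
    ix = [flipud(ix(1:n)); ix(n+1:end)];   % descending, NaN rows kept at the bottom
    C_sorted(:,:,j) = C(ix,:,j);           % reorder only site j
end
```

Each site keeps all of its rows and is ordered by its own values, so the third dimension stays rectangular.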