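I have a set of data in a cell array, part of which is shown below. Rows 2 and 3 are identical in the first three columns. From column 4 onward, row 2 contains only P0702, which is already captured in row 3 (which has the same first three columns as row 2), so I want to delete row 2. Similarly, rows 5 and 6 have the same data in the first three columns. The P0882 and P0702 in row 5 are also present in row 6, so I want to delete row 5.
Data before removing duplicates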
'1FA' 2 'Fm' [] [] [] [] [] [] [] 'P2700' []
'1Fc' 2 'Fz' [] [] [] 'P0702' [] [] [] [] []
'1Fc' 2 'Fz' [] 'P0702' 'P0801' [] [] [] [] [] []
'1Fj' 8 'Fr' 'P0702' [] [] [] [] [] [] [] []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' 'P2700'
Data after removing duplicates
'1FA' 2 'Fm' [] [] [] [] [] [] [] 'P2700' []
'1Fc' 2 'Fz' [] 'P0702' 'P0801' [] [] [] [] [] []
'1Fj' 8 'Fr' 'P0702' [] [] [] [] [] [] [] []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' 'P2700'
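Any help would be great.
Answer 0: (score: 0)
On first reading the question I thought this should be possible in 2 or 3 lines, but it takes a bit more code to solve: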
M={'1FA' 2 'Fm' [] [] [] [] [] [] [] 'P2700' []
'1Fc' 2 'Fz' [] [] [] 'P0702' [] [] [] [] []
'1Fc' 2 'Fz' [] 'P0702' 'P0801' [] [] [] [] [] []
'1Fj' 8 'Fr' 'P0702' [] [] [] [] [] [] [] []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' []
'1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' 'P2700' }
%r holds the number of non-empty cells in columns 4:end of each row; within
%each group of rows sharing the first three columns, keep the row with the highest r
r=sum(cellfun(@(x)~isempty(x),(M(:,4:end))),2);
%Create an index matrix that maps each string in the first and third columns
%to a double, so that unique(...,'rows') can be applied to it.
[~,~,index]=unique(M(:,1));
index(:,2)=[M{:,2}];
[~,~,index(:,3)]=unique(M(:,3));
%Fill the fourth column with consecutive numbers, used to restore the original
%ordering
index(:,4)=1:size(index,1);
%Next two lines: sort index so that rows with the highest r come first
[~,sorted_most_content]=sort(-r);
index=index(sorted_most_content,:);
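%Rows that share the first three columns are now ordered with the best choice
%first, so unique can pick one row per group. 'first' makes explicit that the
%first occurrence is kept (older MATLAB releases returned the last one by default).
[~,indices_unique_content,~]=unique(index(:,1:3),'rows','first');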
%Use the previously appended consecutive numbers to get the row numbers we want.
%sort restores the original ordering.
unique_content_inorder=sort(index(indices_unique_content,4));
%The data you want:
M(unique_content_inorder,:)
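
Note that the code above keeps the row with the most non-empty cells in each group; it does not check that the dropped rows' codes really appear in the kept row. If that matters, a minimal sketch along the following lines may help. It is not part of the original answer. It assumes columns 1 and 3 hold strings, column 2 holds a numeric scalar, and the character `|` never appears in the names. The variable names (`keys`, `codes`, `keep`, `result`) are made up for illustration.

```matlab
% Sample data from the question (codes live in columns 4:end).
M = {'1FA'  2 'Fm' [] [] [] [] [] [] [] 'P2700' []
     '1Fc'  2 'Fz' [] [] [] 'P0702' [] [] [] [] []
     '1Fc'  2 'Fz' [] 'P0702' 'P0801' [] [] [] [] [] []
     '1Fj'  8 'Fr' 'P0702' [] [] [] [] [] [] [] []
     '1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' []
     '1FAH' 2 'Fo' [] [] [] [] [] [] 'P0882' 'P0702' 'P2700'};

n = size(M, 1);
keys  = cell(n, 1);   % one text key per row, built from columns 1:3
codes = cell(n, 1);   % non-empty codes of each row
for i = 1:n
    % Assumption: column 2 is a numeric scalar, columns 1 and 3 are strings.
    keys{i} = sprintf('%s|%g|%s', M{i,1}, M{i,2}, M{i,3});
    rowCodes = M(i, 4:end);
    codes{i} = rowCodes(~cellfun(@isempty, rowCodes));
end

keep = true(n, 1);
for i = 1:n
    for j = 1:n
        % Only compare against other rows that are still kept and share the key.
        if i == j || ~keep(j) || ~strcmp(keys{i}, keys{j})
            continue
        end
        % Row i is redundant if every one of its codes also appears in row j.
        if isempty(codes{i})
            isSubset = true;
        elseif isempty(codes{j})
            isSubset = false;
        else
            isSubset = all(ismember(codes{i}, codes{j}));
        end
        if isSubset
            keep(i) = false;
            break
        end
    end
end

result = M(keep, :);
```

For the sample data this keeps rows 1, 3, 4 and 6, which matches the desired output in the question. Unlike the counting approach, a row survives if it carries a code that no other row with the same first three columns has. If two rows in a group have identical code sets, only the later one is kept.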