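I have a fairly large dataset (48000 * 53) in which I am trying to find multivariate outliers.
But every time I try a function such as trimmean() or leverage(), I get the same error.
My dataset should already be free of NaN values, but I still tried running D(find(sum(isnan(D),2)==0),:);
and D(any(isnan(D),2), :)=[];
but I still get the same error.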
Answer 0 (score: 0)
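The following example removes every row that contains a NaN element from a dataset.
The example is based on the following post: Matlab. Replace missed values with an avg
Check the following code sample: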
%Create dataset for the example.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
VarName1 = [4; NaN; 6; 7; 6; 6; 6; 5; 5; 6; 6];
VarName2 = [2; 2; 2; 3; 3; 2; NaN; 2; 2; NaN; 3];
VarName3 = {'aa'; 'aa'; 'aa'; 'bbb'; 'bbb'; 'ccc'; 'ccc'; 'ccc'; 'ccc'; 'dddd'; 'dddd'};
D = dataset(VarName1, VarName2, VarName3);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Convert dataset to cell array.
C = dataset2cell(D);
%Find all indexes of NaN elements (use anonymous function).
nanIdx = cellfun(@(x)(any(isnan(x))), C);
%Find indexes of rows with NaN elements
nanRows = any(nanIdx,2);
%Keep only rows without NaN elements.
C = C(~nanRows, :);
D = cell2dataset(C);
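
If the data is held in a plain numeric matrix rather than a dataset object (an assumption, since the question does not say which type D is), the conversion to a cell array is not needed. A minimal sketch, using a small hypothetical matrix M in place of the real 48000-by-53 data:

```matlab
% Hypothetical 4-by-3 numeric matrix standing in for the real data.
M = [1 2 3; 4 NaN 6; 7 8 9; NaN 11 12];
% Logical column vector: true for every row that contains at least one NaN.
nanRows = any(isnan(M), 2);
% Keep only the complete rows and store the result in a new variable.
Mclean = M(~nanRows, :);
```

Note that an indexing expression written on its own line without an assignment, such as the first statement in the question, only returns a filtered copy and leaves the original variable unchanged.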