I am using MATLAB's fillmissing function to fill in missing values.
Suppose your matrix looks like this:
A = rand(10,2);
A(end-5:end,1) = NaN;
% this gives:
A =
0.8147 0.1576
0.9058 0.9706
0.1270 0.9572
0.9134 0.4854
NaN 0.8003
NaN 0.1419
NaN 0.4218
NaN 0.9157
NaN 0.7922
NaN 0.9595
You can apply fillmissing as follows:
Afilled = fillmissing(A, 'previous')
The resulting matrix then looks like this:
Afilled =
0.8147 0.1576
0.9058 0.9706
0.1270 0.9572
0.9134 0.4854
0.9134 0.8003
0.9134 0.1419
0.9134 0.4218
0.9134 0.9157
0.9134 0.7922
0.9134 0.9595
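However, as it stands, the function does not take into account how many consecutive observations are actually missing (6 in this case).
I am looking for a way to limit how many missing observations get filled with the last known value. For example, fill only the first 5 missing observations in a gap and leave the rest as NaN: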
Afilled2 =
i=1                  0.8147 0.1576
i=2                  0.9058 0.9706
i=3                  0.1270 0.9572
i=4                  0.9134 0.4854
i=5   % missing 1    0.9134 0.8003
i=6   % missing 2    0.9134 0.1419
i=7   % missing 3    0.9134 0.4218
i=8   % missing 4    0.9134 0.9157
i=9   % missing 5    0.9134 0.7922
i=10                 NaN    0.9595
Answer 0 (score: 1)
MATLAB's fillmissing function does not have this feature. Here is some simple code that does what you want (filling along dimension 1 with the 'previous' method):
% parameter: maximum number of observations to fill with a given value
max_fill_obs = 5;
% loop over columns
for col = 1 : size(A, 2)
    % initialize a counter (the number of previously filled values) to 0
    counter = 0;
    % loop over rows within column col, starting from the second row
    for row = 2 : size(A, 1)
        % if the current element is known, reset the counter to 0
        if ~isnan(A(row, col))
            counter = 0;
        % otherwise, if we haven't already filled in max_fill_obs values,
        % fill in the value and increment the counter
        elseif counter < max_fill_obs
            A(row, col) = A(row - 1, col);
            counter = counter + 1;
        end
    end
end
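This approach also works when there are multiple blocks of NaN values: only the first max_fill_obs values in each block are filled. For example, try running it on the matrix defined by:
A = rand(20,2);
A(5:10,1) = NaN;
A(13:19,1) = NaN;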
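As a quick check of the multi-block case, here is a minimal sketch that runs the same loop on that matrix and lists the rows of column 1 that stay missing. The copy B, the variable still_missing, and the fixed seed are additions for illustration only and are not part of the original code.

```matlab
% Test data: two NaN blocks in column 1 (6 rows and 7 rows long)
rng(0);                       % fixed seed, only so the run is repeatable
A = rand(20, 2);
A(5:10, 1)  = NaN;
A(13:19, 1) = NaN;

max_fill_obs = 5;             % fill at most 5 values per NaN block
B = A;                        % work on a copy so A keeps its NaNs

for col = 1 : size(B, 2)
    counter = 0;              % values filled so far in the current block
    for row = 2 : size(B, 1)
        if ~isnan(B(row, col))
            counter = 0;      % known value: a new block may start later
        elseif counter < max_fill_obs
            B(row, col) = B(row - 1, col);   % carry the previous value down
            counter = counter + 1;
        end
    end
end

% Rows of column 1 that were left unfilled
still_missing = find(isnan(B(:, 1)))';
disp(still_missing)           % prints 10 18 19
```

Rows 5 to 9 and 13 to 17 are filled, while row 10 and rows 18 to 19 stay NaN because they are beyond the fifth missing value of their block.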
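Here is a vectorized version of the code above (apply it to the original A, before the loop has modified it, and with max_fill_obs already defined):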
Afilled = fillmissing(A, 'previous');
Afilled(movsum(isnan(A), [max_fill_obs, 0]) > max_fill_obs) = NaN;
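To see why the mask works: the window [max_fill_obs, 0] covers the current element plus the max_fill_obs elements before it, so the moving sum of isnan exceeds max_fill_obs only when all of them are NaN, which means the current element is beyond the allowed number of fills in its block. The sketch below shows this on a small made-up column; the vector x, the limit of 2, and the name nan_count are chosen only for illustration.

```matlab
max_fill_obs = 2;                        % small limit to keep the example short
x = [1; NaN; NaN; NaN; NaN; 5; NaN];     % made-up column with a 4-long gap

% NaN count over the current element and the max_fill_obs elements before it
nan_count = movsum(isnan(x), [max_fill_obs, 0]);

% Fill everything first, then undo the fills that exceed the limit
xfilled = fillmissing(x, 'previous');
xfilled(nan_count > max_fill_obs) = NaN;

% Columns: original, NaN count in window, limited fill
disp([x, nan_count, xfilled])
```

In this example nan_count is [0 1 2 3 3 2 2]', so only rows 4 and 5 are reset to NaN and xfilled is [1 1 1 NaN NaN 5 5]'. Both fillmissing and movsum require a reasonably recent MATLAB release (fillmissing was introduced in R2016b).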