Question

I am working with sediment particle size distributions.

row : 1~50
column : 1~10

Typically, row 1 looks like this:

[0  0   0   0   0   0   0   0   0   0   0   0.002   0.014 0.010 0.015   0
0.020   0.073   0   0   0   0   0   0   0   0   0   0   9.104 0 0   0   0   0   
0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]

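Almost every row shows this pattern. I have to remove the outliers here.

Removal method (Richard Styles, 2013): outliers are identified by first computing the difference defined by Y_i+1 - Y_i, where Y_i is the i-th data point, and then removing all values greater than R times the mean of the absolute value of this difference for a given profile. R is adjustable.

What I don't know is the next step: how to replace the value at RM_point in the row with the value that precedes it.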

Dif = abs(diff(st2.ex_dep_1m(1,26:75), 1, 2)); 
M_Dif = mean(Dif, 2); 
RM_point = find(Dif(1,:) >= M_Dif*3);
st2.ex_dep_1m(1, RM_point(1,2))

0   0   0   0   0   0   0   0   0   0   0   0.002   0.014 0.010 0.015   0   
0.020   0.073   0   0   0   0   0   0   0   0   0   0   9.104 0 0   0   0   0   
0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

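In this matrix, 9.104 is the outlier, so I want to replace 9.104 with 0. However, there are other cases we have to consider:

ex1) There are multiple RM_point instances
ex2) When applying the diff function, equal values appear: Row [0 9 0] gives diff [9 -9], so abs gives [9 9], but in reality the row contains only one 9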

Answer 1

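To make the code more general and able to handle the cases

ex1) There are multiple RM_point instances

ex2) When applying the diff function, equal values appear: Row [0 9 0] gives diff [9 -9], so abs gives [9 9], but in reality the row contains only one 9

you should search for values >= M_Dif*3 (the RM_point points) in the original dataset, not in the Dif array.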
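As a small illustration of why the two searches differ, here is a sketch with a made-up row containing a single spike (the names row, R, idx_from_diff and idx_from_row are only for this example, and the row is longer than [0 9 0] so that the spike actually exceeds the threshold):

```matlab
% Hypothetical row: 16 values with a single spike of 9 at position 9
row = [zeros(1,8) 9 zeros(1,7)];
R   = 3;                          % adjustable multiplier (3 is used in the question)

Dif   = abs(diff(row, 1, 2));     % one spike produces two large differences
M_Dif = mean(Dif, 2);             % 18/15 = 1.2, so the threshold is 3.6

% Searching in Dif flags two positions, and they index Dif, not row
idx_from_diff = find(Dif >= M_Dif*R);

% Searching in the original row flags the spike itself, once
idx_from_row  = find(row >= M_Dif*R);
```

Note that comparing the raw values against a threshold derived from the differences relies on the baseline of the row being close to zero, as it is in the data shown in the question.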

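Below you can find the updated code. To make the result easier to understand, I modified the input data as follows:

Removed some 0s to make the array "shorter"
Inserted some additional values to be removed (3.21 and 9.104)
The updated array now contains three consecutive 9.104 values, to cover the ex1 and ex2 cases from your question

In addition, in the code I set the values at RM_point to the threshold (M_Dif*3), only to make the result easier to see.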

% Modified input row:
%   some "0" removed
%   replaced 4 "0" with 3.21 9.104   9.104 9.104
%
st2.ex_dep_1m=[  0   0   0.002   0.014 0.010 0.015   0, ...
   0.020   0.073   0     0   3.21   0   0   9.104   9.104 9.104 0   0   0   0, ...
   0    0   0   0   0]
% Create a copy of the original row to make it easier to verify the result
st2.ex_dep_1m_CLEAN=st2.ex_dep_1m

Dif = abs(diff(st2.ex_dep_1m, 1, 2)); 
M_Dif = mean(Dif, 2); 
% Identify the points to be removed within the original row
% RM_point = find(Dif(1,:) >= M_Dif*3);
RM_point = find(st2.ex_dep_1m >= M_Dif*3);
% Set the points to be removed to the threshold (to better visualize them)
% st2.ex_dep_1m(1, RM_point(1,2))
st2.ex_dep_1m_CLEAN(1, RM_point)=M_Dif*3

% Plot the Original and the Modified row
a1=subplot(2,1,1)
plot(st2.ex_dep_1m,'o','markerfacecolor','k')
legend('Original')
grid on
a2=subplot(2,1,2)
plot(st2.ex_dep_1m_CLEAN,'o','markerfacecolor','r')
grid on
legend('Cleaned')
% Adjust the ylim
a1y=get(a1,'ylim')
a2y=get(a2,'ylim')
y_lim=[min([a1y a2y]) max([a1y a2y])]
set([a1 a2],'ylim',y_lim)
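If you want each flagged point to take its previous value instead of the threshold, one possible sketch is shown below. It uses a plain vector named row rather than st2.ex_dep_1m, and the names R and row_clean are made up for this example. Falling back to 0 when the very first element is flagged is an assumption, since that element has no previous value.

```matlab
% Same modified row as above, stored in a plain vector for this sketch
row = [0 0 0.002 0.014 0.010 0.015 0 0.020 0.073 0 0 3.21 0 0 ...
       9.104 9.104 9.104 0 0 0 0 0 0 0 0 0];
R = 3;                                   % adjustable multiplier

Dif      = abs(diff(row, 1, 2));
M_Dif    = mean(Dif, 2);
RM_point = find(row >= M_Dif*R);         % indices into the original row

% Walk the flagged indices in ascending order and copy the previous,
% already cleaned value, so a run of outliers inherits the last good value
row_clean = row;
for k = RM_point
    if k > 1
        row_clean(k) = row_clean(k-1);
    else
        row_clean(k) = 0;                % assumption: no previous value exists
    end
end
```

Because the loop reads from row_clean rather than row, the three consecutive 9.104 values are all replaced by the 0 that precedes them. For the full matrix, the same steps could be repeated for each row inside an outer loop.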

Hope this helps.
