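I'm currently trying to run a program that takes a 2D array and filters out NaN values and high-amplitude noise.
The program does this by first breaking the 2D array into row slices and then, index by index within each row, replacing NaN values and any values whose residual exceeds one standard deviation of the residuals with values from a polynomial fit to data of the same size. The code block below is the main part of the program, which is meant to do this through nested for loops: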
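import numpy as np
# Assumption: `p` is NumPy's polynomial module, which matches the
# p.polyfit(x, y, deg) and p.polyval(x, coeff) calls used below.
from numpy.polynomial import polynomial as p

# I have to remove the first and last columns of data and filter them on their own,
# as per instruction. The next 24 lines of code do this.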
data_initial = data[:,0]
data_final = data[:,-1]
# Create Polynomial fit to data, excluding NaNs.
idx = np.isfinite(x) & np.isfinite(data_initial)
coeff_initial = p.polyfit(x[idx], data_initial[idx], 30, full=True)
pfit_initial = p.polyval(x, coeff_initial[0])
idy = np.isfinite(x) & np.isfinite(data_final)
coeff_final = p.polyfit(x[idy], data_final[idy], 30, full=True)
pfit_final = p.polyval(x, coeff_final[0])
# replace NaN values in first and last profiles with their corresponding
# polynomial fit values
for i in range(len(data_initial)):
    if np.isnan(data_initial[i]):
        data_initial[i] = pfit_initial[i]
for i in range(len(data_final)):
    if np.isnan(data_final[i]):
        data_final[i] = pfit_final[i]
data_initial = data_initial.reshape(len(data_initial),1)
data_final = data_final.reshape(len(data_final),1)
data_smooth = np.array([])
for i in range(len(x)):
    data_set = data[i,1:-1] # "data" is the name of the original 2D array
    # The next 3 lines are responsible for fitting a polynomial line to each row of the data
    ids = np.isfinite(y[1:-1]) & np.isfinite(data_set)
    coeff = p.polyfit(y[1:-1][ids], data_set[ids], 20)
    pfit = p.polyval(y[1:-1], coeff)
    # The next nested for loop is designed to replace any Nan values with values
    # from the polynomial fit line "pfit"
    for k in range(len(data_set)):
        if np.isnan(data_set[k]):
            data_set[k] = pfit[k]
    # The next 2 lines calculates the residual noise of the data and the 1st standard
    # deviation of the data
    residuals = data_set - pfit
    standard_dev = 1*np.std(residuals)
    # The next nested for loop is designed to replace any values inside "residuals"
    # with polynomial fit values if they exceed the 1st standard deviation
    for j in range(len(residuals)):
        if abs(residuals[j]) >= abs(standard_dev):
            data_set[j] = pfit[j]
    # The final four lines inside of the overarching loop reshape the data so that
    # each row can be stacked to create a 2D array
    if len(data_smooth) == 0:
        data_smooth = data_set[None,:]
    else:
        data_smooth = np.vstack((data_smooth,data_set))
# These 2 lines add the previously sliced first and last columns back to the now
# filtered 2D array of data.
data_smooth_1 = np.hstack((data_initial,data_smooth))
Data_filtered = np.hstack((data_smooth_1,data_final))
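However, when the program runs to completion (and it does so without errors), it appears to do nothing to the original data: the original data set and the "filtered" data set are identical element for element. Am I doing something wrong when I reassign values inside the nested for loops? I'm at a loss and have been struggling with this for days. Please help!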
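One way to check whether the two arrays match because the original was changed in place is sketched below. It assumes `data` is a NumPy ndarray; the small array here is a hypothetical stand-in for the real data, and `original`, `row`, `shares` and `changed` are names introduced only for this illustration. Basic slices such as `data[i, 1:-1]` are views, so assigning into the slice writes into `data` as well.

```python
import numpy as np

# Hypothetical toy array standing in for the real `data`.
data = np.array([[1.0, np.nan, 3.0, 4.0],
                 [5.0, 6.0, np.nan, 8.0]])
original = data.copy()           # independent snapshot taken before filtering

row = data[0, 1:-1]              # basic slicing returns a view, not a copy
row[np.isnan(row)] = 0.0         # writing through the view edits `data` itself

# True if `row` and `data` use the same underlying memory buffer.
shares = np.shares_memory(row, data)
# True if `data` no longer matches the snapshot (NaNs compared as equal).
changed = not np.array_equal(data, original, equal_nan=True)
print(shares, changed)
```

If both flags come out true on the real array, the "filtered" result and the original are identical simply because they are the same memory.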
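Update: I've found that it is filtering the data as it should, but the problem is that it is also redefining every element of the previous array (the original `data`). Why does this happen?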
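A likely explanation, assuming `data` is a NumPy ndarray, is that slices like `data[:,0]` and `data[i,1:-1]` are views into `data`, so every assignment to `data_set[k]` also lands in the original. A minimal sketch of one way around that is to filter a copy of each row. The helper `filter_row`, its parameters `deg` and `n_sigma`, and the toy data are all hypothetical, and the low default degree is chosen only to keep the example small.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def filter_row(coords, row, deg=2, n_sigma=1.0):
    """Return a filtered copy of `row`; the input array is left untouched."""
    out = np.array(row, dtype=float)          # np.array copies by default
    ok = np.isfinite(coords) & np.isfinite(out)
    coeff = P.polyfit(coords[ok], out[ok], deg)
    fit = P.polyval(coords, coeff)
    out[~ok] = fit[~ok]                       # fill NaNs from the fit
    resid = out - fit
    noisy = np.abs(resid) >= n_sigma * np.std(resid)
    out[noisy] = fit[noisy]                   # replace large residuals
    return out


# Hypothetical data: three copies of a straight line, one NaN, one spike.
y = np.linspace(0.0, 1.0, 8)
data = np.tile(1.0 + 2.0 * y, (3, 1))
data[0, 3] = np.nan
data[1, 5] += 10.0
snapshot = data.copy()

# Each row is copied inside filter_row, so `data` is never written to.
filtered = np.vstack([filter_row(y, r) for r in data])
```

In the original loop the same effect could be had by writing `data_set = data[i,1:-1].copy()` (and likewise `data[:,0].copy()` and `data[:,-1].copy()`), which leaves `data` as it was and puts the filtered values only in the new array.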