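I have a DataFrame (df) and I want to interpolate/downscale the data to daily values, per group. I know how to do this in pandas with 'interpolate', but the available interpolation methods are limited to those of scipy.interpolate.interp1d, and I want to use a Whittaker smoother instead. Is there a way to do something like the following?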
df.groupby(['ID']).apply(lambda x: x.resample('1D').first().interpolate(method = my-custom-function-Whittaker_smoother))
import pandas as pd
# Sample data: three observation dates for each of four IDs
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    "date": ['2015-04-09', '2015-04-09', '2015-04-09', '2015-04-09', '2015-06-03', '2015-06-03', '2015-06-03', '2015-06-03', '2015-06-08', '2015-06-08', '2015-06-08', '2015-06-08'],
    "V_1n": [0.2, 0.5, 0.8, 0.4, 0.9, 0.5, 3.0, 5.0, 0.0, 5.0, 0.0, 0.4],
})
df['date'] = pd.to_datetime(df['date'], format="%Y-%m-%d")
df.set_index('date', inplace=True)
# What I do now: resample each ID to daily rows and fill the gaps with the default (linear) interpolation
df_interpolation = df.groupby(['ID']).apply(lambda x: x.resample('1D').first().interpolate())
I already have the smoothing function:
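import numpy as np
import scipy as sp
import scipy.sparse
import scipy.linalg
from scipy.sparse.linalg import cg

def Whittaker_smoother(y, lmda):
    # Smooth y by solving (E + lmda * D'D) z = y, where D takes third-order differences
    m = len(y)
    E = sp.sparse.identity(m)
    # Coefficients of the third-order difference: -1, 3, -3, 1
    d1 = -1 * np.ones((m), dtype='d')
    d2 = 3 * np.ones((m), dtype='d')
    d3 = -3 * np.ones((m), dtype='d')
    d4 = np.ones((m), dtype='d')
    D = sp.sparse.diags([d1, d2, d3, d4], [0, 1, 2, 3], shape=(m-3, m), format="csr")
    # cg returns a tuple (solution, info); only the solution is returned
    z = sp.sparse.linalg.cg(E + lmda * (D.transpose()).dot(D), y)
    return z[0]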
Taken from here: https://gist.github.com/zmeri/3c43d3b98a00c02f81c2ab1aaacc3a49
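
One possible direction, sketched under assumptions rather than as a confirmed solution: since 'interpolate' does not accept a custom callable for 'method', the smoothing can be done per group outside of it. The sketch below uses a weighted variant of the smoother, where missing days get weight 0 so that the penalty term fills them in. This is a different function from the gist above: the name whittaker_fill, the lmda value, and the use of spsolve instead of cg are all choices made for illustration.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from scipy.sparse.linalg import spsolve


def whittaker_fill(y, lmda=10.0, order=3):
    """Weighted Whittaker smoother (hypothetical helper): NaNs get weight 0."""
    y = np.asarray(y, dtype=float)
    m = len(y)
    w = np.isfinite(y).astype(float)       # 1 for observed days, 0 for gaps
    y0 = np.where(w > 0, y, 0.0)            # placeholder under zero weight
    # Difference matrix of shape (m - order, m); built densely, fine for small m
    D = sparse.csr_matrix(np.diff(np.eye(m), n=order, axis=0))
    A = sparse.diags([w], [0]) + lmda * (D.T @ D)
    # Solve (W + lmda * D'D) z = W y directly
    return spsolve(sparse.csc_matrix(A), w * y0)


df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    "date": ["2015-04-09"] * 4 + ["2015-06-03"] * 4 + ["2015-06-08"] * 4,
    "V_1n": [0.2, 0.5, 0.8, 0.4, 0.9, 0.5, 3.0, 5.0, 0.0, 5.0, 0.0, 0.4],
})
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
df.set_index("date", inplace=True)

pieces = []
for key, g in df.groupby("ID"):
    # Resample to daily rows; days without an observation become NaN
    daily = g[["V_1n"]].resample("1D").first()
    daily["V_1n"] = whittaker_fill(daily["V_1n"].to_numpy(), lmda=10.0)
    daily.insert(0, "ID", key)
    pieces.append(daily)

result = pd.concat(pieces)
```

With only three observations per ID and third-order differences, the penalty can be driven to zero, so the filled curve is simply the quadratic through those three points; lmda should only start to matter once a group has more observations. The explicit loop is used instead of groupby.apply to avoid depending on how a given pandas version passes the grouping column to the applied function.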