Pandas interpolation with a custom function

Date: 2020-05-04 17:06:14

Tags: python pandas scipy interpolation

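I have a DataFrame (df) and I want to interpolate/downscale its data to daily resolution, group by group. I know how to do this with pandas' 'interpolate', but the interpolation methods it offers are limited to those of scipy.interpolate.interp1d, and I want to use a Whittaker smoother instead. Is there a way to do something like this?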

df.groupby(['ID']).apply(lambda x: x.resample('1D').first().interpolate(method = my-custom-function-Whittaker_smoother))



import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
                   "date": ['2015-04-09', '2015-04-09', '2015-04-09', '2015-04-09', '2015-06-03', '2015-06-03', '2015-06-03', '2015-06-03', '2015-06-08', '2015-06-08', '2015-06-08', '2015-06-08'],
                   "V_1n": [0.2, 0.5, 0.8, 0.4, 0.9, 0.5, 3.0, 5.0, 0.0, 5.0, 0.0, 0.4]})
df['date'] = pd.to_datetime(df['date'], format="%Y-%m-%d")
df.set_index('date', inplace=True)
# Current approach: resample each ID to daily steps, then fill the gaps linearly
df_interpolation = df.groupby(['ID']).apply(lambda x: x.resample('1D').first().interpolate())

I already have the function ready:

import numpy as np
import scipy as sp
import scipy.sparse
import scipy.sparse.linalg

def Whittaker_smoother(y, lmda):
  # Solve (E + lmda * D'D) z = y, where D takes third-order differences of z
  m = len(y)
  E = sp.sparse.identity(m)
  d1 = -1 * np.ones((m),dtype='d')
  d2 = 3 * np.ones((m),dtype='d')
  d3 = -3 * np.ones((m),dtype='d')
  d4 = np.ones((m),dtype='d')
  D = sp.sparse.diags([d1,d2,d3,d4],[0,1,2,3], shape=(m-3, m), format="csr")
  # cg returns a tuple (solution, info); only the solution is returned below
  z = sp.sparse.linalg.cg(E + lmda * (D.transpose()).dot(D), y)

  return z[0]

Taken from here: https://gist.github.com/zmeri/3c43d3b98a00c02f81c2ab1aaacc3a49
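One possible way to get there, sketched under some assumptions: as far as I can tell, 'interpolate' only accepts its built-in method names, so a custom callable cannot simply be passed as `method`. The function above also has no notion of missing values, so the daily NaN gaps would have to be handled inside the smoother. The sketch below swaps in the weighted variant of the Whittaker smoother, where missing days get weight 0 and are filled by the smoothness penalty alone, and it uses a direct sparse solve instead of `cg`. The names `whittaker_fill` and `interpolate_series`, the default `lmbda=10.0`, and the default `order=2` are all hypothetical choices for illustration (`order=3` would match the third-order differences of the function above).

```python
import numpy as np
import pandas as pd
from scipy import sparse
from scipy.sparse.linalg import spsolve


def whittaker_fill(values, lmbda=10.0, order=2):
    """Weighted Whittaker smoother (hypothetical helper): NaNs get weight 0."""
    y = np.asarray(values, dtype=float)
    m = len(y)
    w = (~np.isnan(y)).astype(float)      # 1 where observed, 0 where missing
    if m <= order or w.sum() < order:
        return y.copy()                   # too few points for a unique solution
    # Build the order-th difference matrix by differencing the identity rows
    D = sparse.eye(m, format="csr")
    for _ in range(order):
        D = D[1:] - D[:-1]
    W = sparse.diags(w, 0, format="csr")
    A = (W + lmbda * (D.T @ D)).tocsc()
    # Solve (W + lmbda * D'D) z = W y; missing entries contribute nothing to W y
    return spsolve(A, w * np.nan_to_num(y))


def interpolate_series(s, lmbda=10.0):
    """Resample one group's series to daily steps and fill/smooth it."""
    daily = s.resample("1D").first()      # days without data become NaN
    return pd.Series(whittaker_fill(daily.to_numpy(), lmbda), index=daily.index)


# Same data as in the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 4] * 3,
    "date": ["2015-04-09"] * 4 + ["2015-06-03"] * 4 + ["2015-06-08"] * 4,
    "V_1n": [0.2, 0.5, 0.8, 0.4, 0.9, 0.5, 3.0, 5.0, 0.0, 5.0, 0.0, 0.4],
})
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
df = df.set_index("date")

# Apply per ID and stack the pieces into one Series indexed by (ID, date)
pieces = {gid: interpolate_series(s) for gid, s in df.groupby("ID")["V_1n"]}
result = pd.concat(pieces, names=["ID", "date"])
```

Note that this smooths as well as fills: with a larger `lmbda` the curve no longer passes exactly through the observed values, which may or may not be what is wanted here.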

0 answers:

No answers yet.