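My time series data is periodic (and sinusoidal) in nature. I have rescaled the data by its period so that all points lie between 0 and 1. You can think of this as equivalent to points sampled from a sine wave from 0 to its period, 2pi. Here is a typical case: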
I have tried interpolating this data with various scipy.interpolate functions, for example:
>>> scipy.interpolate.UnivariateSpline(x,y)(numpy.linspace(0, 0.99, 100))
array([ 15.13403109, 15.10173144, 15.07070986, 15.04094629,
15.01242068, 14.98511296, 14.95900308, 14.93407098,
14.91029659, 14.88765987, 14.86614074, 14.84571915,
14.82637504, 14.80808836, 14.79083904, 14.77460702,
14.75937224, 14.74511465, 14.73181418, 14.71945078,
14.70800439, 14.69745494, 14.68778239, 14.67896666,
14.6709877 , 14.66382545, 14.65745985, 14.65187085,
14.64703838, 14.64294238, 14.6395628 , 14.63687957,
14.63487264, 14.63352194, 14.63280742, 14.63270902,
14.63320668, 14.63428034, 14.63590994, 14.63807542,
14.64075672, 14.64393378, 14.64758655, 14.65169496,
14.65623896, 14.66119848, 14.66655347, 14.67228387,
14.67836961, 14.68479064, 14.69152691, 14.69855834,
14.70586488, 14.71342648, 14.72122306, 14.72923458,
14.73744098, 14.74582219, 14.75435815, 14.76302882,
14.77181411, 14.78069399, 14.78964838, 14.79865724,
14.80770049, 14.81675809, 14.82580996, 14.83483606,
14.84381632, 14.85273069, 14.8615591 , 14.87028149,
14.87887781, 14.887328 , 14.895612 , 14.90370974,
14.91160117, 14.91926624, 14.92668487, 14.93383702,
14.94070261, 14.9472616 , 14.95349392, 14.95937952,
14.96489834, 14.9700303 , 14.97475537, 14.97905347,
14.98290455, 14.98628855, 14.98918541, 14.99157507,
14.99343747, 14.99475255, 14.99550026, 14.99566053,
14.9952133 , 14.99413852, 14.99241612, 14.99002605])
Here x is, for example (note that some values are repeated):
>>> x
array([ 0. , 0.01 , 0.016, 0.018, 0.024, 0.029, 0.034, 0.036,
0.042, 0.046, 0.048, 0.053, 0.058, 0.062, 0.069, 0.071,
0.072, 0.079, 0.083, 0.091, 0.096, 0.102, 0.102, 0.106,
0.108, 0.111, 0.112, 0.112, 0.122, 0.131, 0.135, 0.136,
0.137, 0.145, 0.164, 0.168, 0.172, 0.174, 0.177, 0.178,
0.179, 0.197, 0.202, 0.205, 0.206, 0.213, 0.215, 0.222,
0.229, 0.233, 0.235, 0.239, 0.239, 0.241, 0.248, 0.255,
0.258, 0.259, 0.262, 0.264, 0.266, 0.267, 0.276, 0.28 ,
0.281, 0.281, 0.285, 0.289, 0.292, 0.292, 0.294, 0.295,
0.299, 0.304, 0.306, 0.309, 0.313, 0.317, 0.32 , 0.32 ,
0.335, 0.34 , 0.341, 0.353, 0.357, 0.359, 0.364, 0.368,
0.369, 0.369, 0.388, 0.39 , 0.394, 0.396, 0.399, 0.401,
0.404, 0.406, 0.407, 0.413, 0.415, 0.418, 0.423, 0.43 ,
0.438, 0.439, 0.443, 0.445, 0.454, 0.455, 0.475, 0.478,
0.478, 0.48 , 0.48 , 0.482, 0.485, 0.486, 0.488, 0.488,
0.498, 0.498, 0.499, 0.508, 0.514, 0.525, 0.527, 0.531,
0.535, 0.536, 0.546, 0.547, 0.551, 0.553, 0.556, 0.563,
0.57 , 0.579, 0.584, 0.59 , 0.594, 0.595, 0.596, 0.606,
0.606, 0.619, 0.628, 0.631, 0.632, 0.633, 0.638, 0.64 ,
0.649, 0.652, 0.654, 0.655, 0.669, 0.674, 0.684, 0.688,
0.689, 0.692, 0.697, 0.697, 0.703, 0.703, 0.703, 0.704,
0.706, 0.715, 0.715, 0.717, 0.72 , 0.721, 0.73 , 0.739,
0.746, 0.75 , 0.751, 0.752, 0.757, 0.762, 0.766, 0.766,
0.783, 0.785, 0.787, 0.79 , 0.791, 0.791, 0.806, 0.809,
0.81 , 0.813, 0.815, 0.816, 0.816, 0.818, 0.82 , 0.823,
0.839, 0.849, 0.857, 0.859, 0.862, 0.864, 0.868, 0.869,
0.875, 0.877, 0.887, 0.888, 0.893, 0.896, 0.905, 0.907,
0.908, 0.925, 0.926, 0.936, 0.947, 0.949, 0.955, 0.957,
0.962, 0.97 , 0.972, 0.976, 0.979, 0.984, 0.985, 0.986,
0.993, 1. ])
and y is, for example:
>>> y
array([ 15.048, 15.046, 15.046, 15.037, 15.035, 15.048, 15.034,
15.041, 15.03 , 15.034, 15.037, 15.04 , 15.038, 15.028,
14.998, 14.976, 15.012, 15.007, 14.996, 14.979, 14.922,
14.876, 14.881, 14.931, 14.912, 14.904, 14.906, 14.897,
14.871, 14.786, 14.778, 14.78 , 14.782, 14.788, 14.729,
14.735, 14.661, 14.722, 14.668, 14.657, 14.715, 14.647,
14.607, 14.627, 14.607, 14.625, 14.619, 14.592, 14.583,
14.596, 14.596, 14.595, 14.584, 14.593, 14.601, 14.597,
14.605, 14.596, 14.61 , 14.6 , 14.582, 14.609, 14.606,
14.619, 14.601, 14.612, 14.619, 14.612, 14.612, 14.618,
14.619, 14.62 , 14.62 , 14.619, 14.633, 14.629, 14.611,
14.62 , 14.629, 14.618, 14.645, 14.634, 14.633, 14.644,
14.647, 14.649, 14.67 , 14.661, 14.658, 14.67 , 14.667,
14.682, 14.676, 14.675, 14.68 , 14.67 , 14.673, 14.676,
14.68 , 14.654, 14.689, 14.699, 14.694, 14.691, 14.699,
14.703, 14.683, 14.691, 14.706, 14.703, 14.715, 14.73 ,
14.727, 14.72 , 14.729, 14.718, 14.712, 14.721, 14.734,
14.722, 14.738, 14.724, 14.73 , 14.729, 14.735, 14.751,
14.741, 14.752, 14.753, 14.765, 14.758, 14.759, 14.766,
14.766, 14.774, 14.774, 14.768, 14.775, 14.789, 14.788,
14.793, 14.787, 14.783, 14.808, 14.789, 14.793, 14.804,
14.804, 14.793, 14.805, 14.808, 14.811, 14.825, 14.816,
14.827, 14.827, 14.827, 14.838, 14.83 , 14.839, 14.848,
14.844, 14.834, 14.838, 14.845, 14.861, 14.856, 14.847,
14.853, 14.868, 14.845, 14.857, 14.859, 14.859, 14.868,
14.853, 14.871, 14.873, 14.875, 14.893, 14.882, 14.883,
14.884, 14.899, 14.904, 14.907, 14.909, 14.903, 14.909,
14.909, 14.91 , 14.911, 14.904, 14.909, 14.933, 14.923,
14.924, 14.907, 14.928, 14.913, 14.939, 14.944, 14.946,
14.952, 14.935, 14.946, 14.943, 14.948, 14.952, 14.957,
14.974, 14.981, 14.967, 14.967, 14.977, 14.987, 14.97 ,
15.013, 14.98 , 15.011, 15.004, 15.013, 15. , 15.017,
15.02 , 15.047, 15.03 , 15.05 , 15.029, 15.043, 15.038,
15.03 , 15.042, 15.052])
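The function should evaluate to (almost) the same number at 0 as at 1, because the underlying data is periodic (just as we would expect a function interpolating a sine to have the same value at 0 as at 2pi). However, it clearly has a long left skew and does not closely resemble the data near 0. The difference between the values at 0 and 1 is about 0.144, which is larger than the standard deviation of the data set.
Any ideas? Can I somehow interpolate while fixing points, i.e. specifying that the values at the start and end of the interval should be approximately the same?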
Answer 0: (score: 2)
The splrep / splev pair of functions claims to support periodic splines, cf. the per parameter.
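As a rough illustration of the per parameter, here is a minimal sketch on synthetic data (a noisy sine over one period, not the data from the question). The names x_demo, y_demo and the smoothing value s are assumptions chosen for this example only. Note that the x in the question contains repeated values, which the periodic fit may reject, so duplicates might need to be merged or averaged first.

```python
import numpy as np
from scipy.interpolate import splrep, splev

# Synthetic stand-in for the phased data: a noisy sine sampled over one period.
rng = np.random.default_rng(0)
x_demo = np.linspace(0.0, 1.0, 50)  # strictly increasing phases in [0, 1]
y_demo = np.sin(2 * np.pi * x_demo) + rng.normal(scale=0.05, size=x_demo.size)

# per=True asks for a spline that is periodic with period x[-1] - x[0].
# s is the smoothing factor; m * sigma**2 is only a rough guess for this noise.
tck = splrep(x_demo, y_demo, k=3, s=x_demo.size * 0.05**2, per=True)

# Evaluate the spline over the full period, including both endpoints.
grid = np.linspace(0.0, 1.0, 101)
fit = splev(grid, tck)

print(fit[0], fit[-1])  # the two endpoint values should agree
```

With per=True the last y value is ignored and the fitted curve is forced to wrap around, so the values at 0 and 1 should match up to rounding error.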
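That UnivariateSpline lacks this option is a bug. Here is a minimal implementation (but it is better not to use it, since accessing _data may not be backwards compatible):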
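from scipy.interpolate import UnivariateSpline, splrep

class PeriodicUnivariateSpline(UnivariateSpline):
    def __init__(self, x, y, w=None, bbox=[None]*2, k=3, s=0):
        # Layout of the private tuple:
        # _data == x,y,w,xb,xe,k,s,n,t,c,fp,fpint,nrdata,ier
        # per=1 requests a periodic spline from FITPACK.
        tck, fp, ier, msg = splrep(x, y, k=k, w=w, xb=bbox[0], xe=bbox[1],
                                   s=s, per=1, full_output=1)
        self._data = (x,y,w,bbox[0],bbox[1],k,s,len(tck[0]),tck[0],tck[1],
                      fp,None,None,ier)
        # Assumption: newer SciPy releases read self.ext when the spline is
        # called; setting it here is harmless on older ones.
        self.ext = 0
        self._reset_class()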