Question

Suppose I have this code:

import numpy as np
import time
from datetime import datetime

class Measurements():
    def __init__(self, time_var, value):
        self.time_var = time_var
        self.value = value

a = np.array([ Measurements('30-01-2017 12:02:15.880922', 100),
               Measurements('30-01-2017 12:02:16.880922', 100),
               Measurements('30-01-2017 12:02:17.880922', 110),
               Measurements('30-01-2017 12:02:18.880922', 99),
               Measurements('30-01-2017 12:02:19.880922', 96)])


b = np.array([ Measurements('30-01-2017 12:02:15.123444', 10),
               Measurements('30-01-2017 12:02:18.880919', 12),
              ])

So I have 5 measurements from a and 2 measurements from b.

Taking a as the reference, I want to find the values that b is missing at the specific times at which a was measured.

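So in the end, b will always have the same time values and the same length as a. (For the times, I was thinking of converting them to seconds with time.mktime(datetime.strptime(s, "%d-%m-%Y %H:%M:%S.%f").timetuple()).)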
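
A rough sketch of that conversion, under a few assumptions: the helper name to_seconds and the choice of reference time are illustrative and not part of the original code. Note that timetuple() carries no microseconds, so going through time.mktime alone would round every timestamp down to a whole second. Subtracting a reference datetime and calling total_seconds() keeps the fractional part and does not depend on the local time zone.

```python
from datetime import datetime

FMT = "%d-%m-%Y %H:%M:%S.%f"

def to_seconds(s, reference):
    """Seconds (including microseconds) between `reference` and the timestamp string `s`."""
    return (datetime.strptime(s, FMT) - reference).total_seconds()

# Hypothetical reference point: the first timestamp of b
t0 = datetime.strptime('30-01-2017 12:02:15.123444', FMT)

# Two sample timestamps taken from a and b above
secs = [to_seconds(s, t0) for s in ('30-01-2017 12:02:15.880922',
                                     '30-01-2017 12:02:18.880919')]
```

The resulting floats are small offsets from t0 rather than epoch seconds, which is enough for interpolation because only the spacing between times matters.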

So b would be:

np.array([ Measurements('30-01-2017 12:02:15.880922', MISSING_VALUE),
               Measurements('30-01-2017 12:02:16.880922', MISSING_VALUE),
               Measurements('30-01-2017 12:02:17.880922', MISSING_VALUE),
               Measurements('30-01-2017 12:02:18.880922', MISSING_VALUE),
               Measurements('30-01-2017 12:02:19.880922', MISSING_VALUE)])

Now, I don't know how to approach this.

One idea is to first run interp, as here, and stretch b to the same length as a.

Or use interp1d (more flexible):

from scipy import interpolate

a = np.array([100, 123, 123, 118, 123])
b = np.array([12, 11, 14, 13])

b_interp = interpolate.interp1d(np.arange(b.size),b, kind ='cubic', assume_sorted=False)
b_new = b_interp(np.linspace(0, b.size-1, a.size))

But how do I deal with the times?

Answer 1

Here is a solution to your problem:

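First, if you use cubic interpolation, you need at least 4 values of a and 4 values of b (scipy.interpolate.interp1d with kind="cubic" does not work with fewer than 4 data points).
Second, by default you cannot use scipy.interpolate.interp1d to interpolate at values outside the range you defined (the range of the times of b).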
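
Both caveats can be checked with a few lines. The sketch below uses made-up sample data (x and y are not from the question) and shows two possible ways to handle out-of-range query points: clipping them to the known range, or asking interp1d to extrapolate with fill_value="extrapolate". Which one is appropriate depends on the data, so treat this as an illustration rather than a recommendation.

```python
import numpy as np
from scipy import interpolate

# Made-up sample data: four points, the minimum for kind="cubic"
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([10.0, 12.0, 13.0, 16.0])

# Caveat 1: fewer than four points is rejected for a cubic interpolant
try:
    interpolate.interp1d(x[:3], y[:3], kind="cubic")
    too_few_failed = False
except ValueError:
    too_few_failed = True

f = interpolate.interp1d(x, y, kind="cubic")
inside = f(1.5)  # within [x.min(), x.max()], so this works

# Caveat 2: a query point outside the range raises ValueError by default
try:
    f(3.5)
    out_of_range_failed = False
except ValueError:
    out_of_range_failed = True

# Option A: clip the query point to the known range before evaluating
clipped = f(np.clip(3.5, x.min(), x.max()))  # evaluates f at x.max()

# Option B: allow extrapolation (cubic extrapolation can drift quickly)
f_ext = interpolate.interp1d(x, y, kind="cubic", fill_value="extrapolate")
extrapolated = f_ext(3.5)
```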

I changed your initial code to show this:

import time
from datetime import datetime

import numpy as np
from scipy import interpolate

time_a_full = ['30-01-2017 12:02:15.880922','30-01-2017 12:02:16.880922','30-01-2017 12:02:17.880922','30-01-2017 12:02:18.880922','30-01-2017 12:02:19.880922','30-01-2017 12:02:22.880922']
time_b_full = ['30-01-2017 12:02:15.123444','30-01-2017 12:02:16.880919','30-01-2017 12:02:18.880920', '30-01-2017 12:02:19.880922','30-01-2017 12:02:20.880922']

def to_seconds(s):
    # timetuple() drops the microseconds, so add them back explicitly
    dt = datetime.strptime(s, "%d-%m-%Y %H:%M:%S.%f")
    return time.mktime(dt.timetuple()) + dt.microsecond / 1e6

# Here I transform the times to seconds, as suggested
time_a = np.array([to_seconds(s) for s in time_a_full])
time_b = np.array([to_seconds(s) for s in time_b_full])

values_a = np.array([100, 100, 110, 99, 96, 95])
values_b = np.array([10, 12, 13, 16, 20])

# Result of the linear interpolation with the NumPy function
# (outside the time range of b, np.interp returns the end values of b)
b_linear = np.interp(time_a, time_b, values_b)

# Result of the cubic interpolation
f = interpolate.interp1d(time_b, values_b, kind="cubic")
# Clip the times of a to stay within the range defined by the times of b
time_a_clipped = np.clip(time_a, time_b.min(), time_b.max())
b_cubic = f(time_a_clipped)
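
To get back to the array of Measurements objects the question asks for, the interpolated values can be paired with the times of a. The following is only a sketch under stated assumptions: the helper resample is hypothetical (it is not in the original post), it uses linear interpolation via np.interp so that it also works with the two-point b from the question, and it measures times relative to the first source timestamp.

```python
from datetime import datetime

import numpy as np

class Measurements:
    def __init__(self, time_var, value):
        self.time_var = time_var
        self.value = value

FMT = "%d-%m-%Y %H:%M:%S.%f"

def resample(source, target):
    """Hypothetical helper: new Measurements at the times of `target`,
    with values linearly interpolated from `source`."""
    t0 = datetime.strptime(source[0].time_var, FMT)

    def secs(m):
        # Seconds (with microseconds) relative to the first source timestamp
        return (datetime.strptime(m.time_var, FMT) - t0).total_seconds()

    src_t = np.array([secs(m) for m in source])
    src_v = np.array([m.value for m in source], dtype=float)
    tgt_t = np.array([secs(m) for m in target])
    order = np.argsort(src_t)  # np.interp expects increasing x values
    new_v = np.interp(tgt_t, src_t[order], src_v[order])
    return np.array([Measurements(m.time_var, float(v))
                     for m, v in zip(target, new_v)])

a = np.array([Measurements('30-01-2017 12:02:15.880922', 100),
              Measurements('30-01-2017 12:02:16.880922', 100),
              Measurements('30-01-2017 12:02:17.880922', 110),
              Measurements('30-01-2017 12:02:18.880922', 99),
              Measurements('30-01-2017 12:02:19.880922', 96)])

b = np.array([Measurements('30-01-2017 12:02:15.123444', 10),
              Measurements('30-01-2017 12:02:18.880919', 12)])

b_on_a = resample(b, a)  # same length and time strings as a
```

With the data from the question, the last two times of a fall after the last time of b, so np.interp holds the final value of b (12) there instead of extrapolating.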
