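Is it possible to build a Pandas Series that interpolates automatically?

Time: 2016-12-01 20:08:29

Tags: python pandas

Is it possible to create a Series that interpolates its values for any given index? I have a predefined interpolation scheme that I want to enforce, and I would rather callers did not apply the interpolation themselves, to rule out any possibility of error.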

import pandas as pd

class InterpolatedSeries(pd.Series):
    pass  # magic?

s = pd.Series([1, 3], index=[1, 3])
i = InterpolatedSeries(s, forward='nearest', backward='nearest', middle='linear')

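Callers would receive i and could then request any value, and I could trust that the values they get follow the prescribed interpolation scheme. The interpolation cannot be precomputed (we don't know in advance which points they will request) or cached (we don't know how many points they will ask for), but the important thing is that there are no complications for the caller.

Is this possible?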

>>> i[[0, 0.11234, 1, 2, 2.367, 3, 4]]
... pd.Series([1, 1, 1, 2, 2.367, 3, 3], index=[0, 0.11234, 1, 2, 2.367, 3, 4])

1 Answer:

Answer 0: (score: 5)

Use __getitem__. It is one of Python's so-called magic methods: http://www.diveintopython3.net/special-method-names.html
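As a minimal, pandas-free illustration of the mechanism: Python translates `obj[key]` into a call to `obj.__getitem__(key)`, so a class can compute a value on demand instead of looking one up. The `Squares` class here is a hypothetical toy, not part of the answer.

```python
class Squares:
    """Hypothetical toy container that computes values instead of storing them."""

    def __getitem__(self, key):
        # Python calls this method for squares[key]; key may be a scalar or a list.
        if isinstance(key, (list, tuple)):
            return [k * k for k in key]
        return key * key


squares = Squares()
single = squares[3]           # dispatches to Squares.__getitem__(3)
several = squares[[1, 2.5]]   # dispatches to Squares.__getitem__([1, 2.5])
```

The pandas subclasses that follow override the same hook.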

class InterpolatedSeries(pd.Series):
    def __init__(self, values, forward='nearest', backward='nearest', middle='linear'):
        super().__init__(values)
        self.forward = forward
        self.backward = backward
        self.middle = middle

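    def __getitem__(self, key):
        # Look up the stored values. Note: recent pandas versions raise
        # KeyError for labels missing from the index, so real interpolation
        # code has to compute those values instead of looking them up.
        values = super().__getitem__(key)
        # Do interpolation here
        return values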

class InterpolatedSeries(pd.Series):
    def __init__(self, values, forward='nearest', backward='nearest', middle='linear'):
        super().__init__(values)
        self.forward = forward
        self.backward = backward
        self.middle = middle

    def __setitem__(self, key, value):
        # Variant of the class above: hook assignment instead of lookup.
        # Do interpolation here, before the value is stored
        super().__setitem__(key, value)

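Another option is to create your own class that interacts with the underlying data structure. This class does not inherit from pd.Series but from object.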

class InterpolatedSeries(object):
    def __init__(self, values, forward='nearest', backward='nearest', middle='linear'):
        self.data = values
        self.forward = forward
        self.backward = backward
        self.middle = middle

    def __getitem__(self, key):
        values = self.data.__getitem__(key)
        # Do interpolation
        return values

    def __getattr__(self, key):
        """Return the attribute of the stored pandas Series if it was not found on this object. This allows methods such as to_csv to work."""
        # __getattr__ is only called after normal attribute lookup has failed,
        # so attributes defined on this class still take precedence.
        if key == "data":
            raise AttributeError(key)  # avoid infinite recursion if data is not set yet
        return getattr(self.data, key)  # fall back to the stored pandas Series

    def __dir__(self):
        """Return the list of attributes. (Most code autocomplete features use this, so this will find your pandas series methods for autocomplete in IDEs). """
        values = dir(self.data)
        return values + super().__dir__()

The above may not be the best approach, but it adds some flexibility and makes it easier to reach the pandas Series methods behind the scenes.
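One possible way to fill in the `# Do interpolation` placeholder for the wrapper approach is sketched here. It assumes only the scheme from the question (linear between known points, nearest value before the first and after the last point) and hard-codes it rather than honoring the `forward`, `backward`, and `middle` parameters. The class name `InterpolatedLookup` is hypothetical, and `numpy.interp` is a swapped-in technique, not something the answer prescribes.

```python
import numpy as np
import pandas as pd


class InterpolatedLookup:
    """Hypothetical wrapper: linear between known points, nearest value outside them."""

    def __init__(self, series):
        # np.interp expects increasing x-coordinates, so sort by index first.
        self.data = series.sort_index()

    def __getitem__(self, key):
        is_scalar = np.isscalar(key)
        points = np.atleast_1d(np.asarray(key, dtype=float))
        xp = self.data.index.to_numpy(dtype=float)
        fp = self.data.to_numpy(dtype=float)
        # Linear interpolation inside [xp[0], xp[-1]]; outside that range
        # np.interp returns fp[0] or fp[-1], i.e. the nearest stored value.
        values = np.interp(points, xp, fp)
        if is_scalar:
            return float(values[0])
        return pd.Series(values, index=points)

    def __getattr__(self, name):
        # Only called when normal lookup fails: delegate to the stored Series.
        if name == "data":
            raise AttributeError(name)  # avoid recursion if data is not set yet
        return getattr(self.data, name)


s = pd.Series([1, 3], index=[1, 3])
i = InterpolatedLookup(s)
result = i[[0, 0.11234, 1, 2, 2.367, 3, 4]]
print(result)
```

Nothing is precomputed or cached: each lookup interpolates only the requested points, and Series methods such as `i.sum()` still reach the stored data through `__getattr__`.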