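Is it possible to build a Series that interpolates its values for any given index? I have a predefined interpolation scheme that I want to prescribe, and I would rather callers not apply the interpolation themselves, to rule out any chance of error.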
import pandas as pd

class InterpolatedSeries(pd.Series):
    pass  # magic?

s = pd.Series([1, 3], index=[1, 3])
i = InterpolatedSeries(s, forward='nearest', backward='nearest', middle='linear')
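The caller receives i and can then request any value, trusting that what they get back follows the prescribed interpolation scheme. The interpolated values clearly cannot be precomputed (we don't know in advance which points will be requested) or cached (we don't know how many points will be requested), but what matters is that there are no complications for the caller.
Is this possible?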
>>> i[[0, 0.11234, 1, 2, 2.367, 3, 4]]
... pd.Series([1, 1, 1, 2, 2.367, 3, 3], index=[0, 0.11234, 1, 2, 2.367, 3, 4])
Answer 0 (score: 5)
Use __getitem__. It is one of Python's so-called magic methods: http://www.diveintopython3.net/special-method-names.html
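class InterpolatedSeries(pd.Series):
    # Tell pandas about the custom attributes so it treats them as metadata.
    _metadata = ['forward', 'backward', 'middle']

    def __init__(self, values, forward='nearest', backward='nearest', middle='linear'):
        super().__init__(values)
        self.forward = forward
        self.backward = backward
        self.middle = middle

    def __getitem__(self, key):
        # Get the stored values (typically raises KeyError for labels that are not in the index)
        values = super().__getitem__(key)
        # Do interpolation
        return values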
Or:
class InterpolatedSeries(pd.Series):
    # Tell pandas about the custom attributes so it treats them as metadata.
    _metadata = ['forward', 'backward', 'middle']

    def __init__(self, values, forward='nearest', backward='nearest', middle='linear'):
        super().__init__(values)
        self.forward = forward
        self.backward = backward
        self.middle = middle

    def __setitem__(self, key, value):
        # Do interpolation
        super().__setitem__(key, value)
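Another option is to create your own class that interacts with the underlying data structure. This class would not inherit from pd.Series, but from object.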
class InterpolatedSeries(object):
    def __init__(self, values, forward='nearest', backward='nearest', middle='linear'):
        self.data = values  # the wrapped pandas Series
        self.forward = forward
        self.backward = backward
        self.middle = middle

    def __getitem__(self, key):
        values = self.data[key]
        # Do interpolation
        return values

    def __getattribute__(self, key):  # maybe __getattr__ if this doesn't work
        """Return the stored pandas Series attribute if the method or attribute was not found. This allows your to_csv method to work."""
        try:
            return super().__getattribute__(key)
        except AttributeError:
            pass
        # Fall back to the stored pandas Series attribute or method.
        return getattr(self.data, key)

    def __dir__(self):
        """Return the list of attributes. (Most code autocomplete features use this, so this will find your pandas Series methods for autocomplete in IDEs.)"""
        values = dir(self.data)
        return values + super().__dir__()
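The above may not be the best approach, but it does add some flexibility and makes it easier to reach the pandas Series methods behind the scenes.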
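To make the "Do interpolation" placeholder concrete, here is a minimal sketch of the wrapper approach. It is an assumption rather than part of the answer: the class name InterpolatedLookup is hypothetical, and the scheme is hard-coded instead of honoring the forward, backward and middle arguments. It relies on numpy.interp, which interpolates linearly between known points and returns the first or last known value outside the range; for out-of-range points that is the same as nearest-neighbour extrapolation, so it matches the defaults in the question.

```python
import numpy as np
import pandas as pd


class InterpolatedLookup:
    """Hypothetical wrapper: answers lookups on a Series by interpolation."""

    def __init__(self, series):
        # np.interp needs increasing x-coordinates, so sort by index first.
        self.data = series.sort_index()

    def __getitem__(self, key):
        scalar = np.isscalar(key)
        points = np.atleast_1d(np.asarray(key, dtype=float))
        # Linear between known points; clamped to the end values outside them.
        values = np.interp(
            points,
            self.data.index.to_numpy(dtype=float),
            self.data.to_numpy(dtype=float),
        )
        if scalar:
            return float(values[0])
        return pd.Series(values, index=points)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name == "data":
            # Avoid infinite recursion if "data" has not been set yet.
            raise AttributeError(name)
        # Delegate everything else to the wrapped Series (e.g. to_csv, max).
        return getattr(self.data, name)


s = pd.Series([1, 3], index=[1, 3])
i = InterpolatedLookup(s)
result = i[[0, 0.11234, 1, 2, 2.367, 3, 4]]
print(result)
```

Because the values are computed on each lookup, nothing is precomputed or cached, and the caller only ever indexes the object. Supporting other schemes would mean branching on the requested method inside __getitem__, which this sketch leaves out.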