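I have a Python class that performs a large number of calculations. The class supports various calculations, each of which may or may not actually be called. Here is an example: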
import pandas as pd

class MyCalc(object):
    def __init__(self, user, query_date, award):
        self.user = user
        self.query_date = query_date
        self.award = award

    def balance(self):  # this can be subtracted
        return self.award.balance

    def value(self):  # this can be subtracted
        if self.user.award_date > self.query_date:
            return self.award.value * self.user.multiplier
        return 0

    def has_multiple_awards(self):  # this can not be subtracted
        return self.user.awards > 2

    def as_pandas_series(self):
        return pd.Series({'balance': self.balance(),
                          'value': self.value(),
                          'query_date': self.query_date,
                          'award': self.award,
                          'user': self.user})
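What I want is to compute the difference between two instances of the class. I came up with the following approach, but I'm not sure whether it has any drawbacks or whether there is a better way.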
import operator

class Diff(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __getattr__(self, attr):
        # Look up the method named `attr` on both instances and call it
        getter = operator.attrgetter(attr)
        closing = getter(self.a)()
        opening = getter(self.b)()
        return closing - opening

a = MyCalc()  # constructor arguments omitted for brevity
b = MyCalc()
diff = Diff(a, b)
print(diff.calc_x)  # calculate a.calc_x() - b.calc_x()
Alternatively, I could add a decorator instead of using a Diff class:
def difference(func):
    def func_wrapper(self):
        # Assumes the instance holds the other instance as self.b
        return func(self) - func(self.b)
    return func_wrapper

class MyCalc(object):
    @difference
    def calc_x(self):
        return some_calc

    @difference
    def calc_y(self):
        return some_calc
Any help would be appreciated.
Answer 0: (score: 1)
import operator

class MyCalc(object):
    def __init__(self, x=0, y=0, *args):
        self.x = x
        self.y = y

    def calc_x(self):
        return self.x * 2

    def calc_y(self):  # There's about 15 of these calculations
        return self.y / 2

class Diff(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def _diff(self, func, *args):
        # func is the name of the calculation method, e.g. "calc_x"
        getter = operator.attrgetter(func)
        closing = getter(self.a)()
        opening = getter(self.b)()
        return closing - opening

a = MyCalc(50)
b = MyCalc(100)
diff = Diff(a, b)
ret = diff._diff("calc_x")
print(ret)  # -100
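Since the question marks some calculations as not subtractable, one possible extension of the idea above is to allow only named calculations to be diffed. This is a minimal sketch, not part of the original answer: `SUBTRACTABLE`, `diff_calc` and `has_multiple` are hypothetical names introduced for illustration.

```python
class MyCalc(object):
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def calc_x(self):
        return self.x * 2

    def calc_y(self):
        return self.y / 2

    def has_multiple(self):  # hypothetical non-subtractable calculation
        return self.x > 2


# Hypothetical whitelist of calculation names that make sense to subtract.
SUBTRACTABLE = frozenset(["calc_x", "calc_y"])


def diff_calc(a, b, name):
    """Return a.<name>() - b.<name>() for whitelisted calculations only."""
    if name not in SUBTRACTABLE:
        raise ValueError("%r cannot be subtracted" % name)
    return getattr(a, name)() - getattr(b, name)()


a = MyCalc(50, 8)
b = MyCalc(100, 2)
print(diff_calc(a, b, "calc_x"))  # 100 - 200
print(diff_calc(a, b, "calc_y"))  # 4.0 - 1.0
```

Asking for a name outside the whitelist, such as "has_multiple", raises ValueError rather than silently subtracting two booleans.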
Answer 1: (score: 1)
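Your Diff class looks fine to me, but I still haven't decided whether it's Pythonic. ;) I don't see any major drawbacks, but it can be made more efficient.
Here is an alternative implementation of the Diff class. It is a little more efficient because it does not have to do a lookup and two operator.attrgetter calls on every __getattr__ call. Instead, it uses functools.partial and the built-in getattr function to cache the attribute-accessing functions.
For testing purposes I have also implemented a simple MyCalc class.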
from functools import partial

class MyCalc(object):
    def __init__(self, u, v):
        self.u = u
        self.v = v

    def calc_x(self):
        return self.u + self.v

    def calc_y(self):
        return self.u * self.v

class Diff(object):
    def __init__(self, a, b):
        self.geta = partial(getattr, a)
        self.getb = partial(getattr, b)

    def __getattr__(self, attr):
        closing = self.geta(attr)()
        opening = self.getb(attr)()
        return closing - opening

a = MyCalc(10, 20)
b = MyCalc(2, 3)
diff = Diff(a, b)
print(diff.calc_x)
print(diff.calc_y)

a.u, a.v = 30, 40
b.u, b.v = 4, 7
print(diff.calc_x)
print(diff.calc_y)
Output
25
194
59
1172
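One detail worth spelling out: Python calls `__getattr__` only when normal attribute lookup fails, which is why reading `self.geta` inside it does not recurse. The sketch below is a hedged variant of the class above, not the answerer's code. The callable check is an added assumption, meant to reject plain data attributes such as `u` with an AttributeError instead of a TypeError.

```python
from functools import partial


class MyCalc(object):
    def __init__(self, u, v):
        self.u = u
        self.v = v

    def calc_x(self):
        return self.u + self.v


class Diff(object):
    def __init__(self, a, b):
        # Stored in the instance __dict__, so reading self.geta / self.getb
        # uses normal lookup and never reaches __getattr__.
        self.geta = partial(getattr, a)
        self.getb = partial(getattr, b)

    def __getattr__(self, attr):
        # Only called when normal attribute lookup on Diff fails.
        closing = self.geta(attr)
        opening = self.getb(attr)
        if not (callable(closing) and callable(opening)):
            # Hypothetical guard: plain data attributes are rejected.
            raise AttributeError("%r is not a calculation method" % attr)
        return closing() - opening()


diff = Diff(MyCalc(10, 20), MyCalc(2, 3))
print(diff.calc_x)             # (10 + 20) - (2 + 3)
print(hasattr(diff, "u"))      # False: "u" is data, not a method
print(hasattr(diff, "nope"))   # False: neither instance has this attribute
```

Raising AttributeError keeps `hasattr` and `getattr` with a default working as usual on Diff objects.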
Answer 2: (score: 1)
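You say your class supports about 15 calculations, all returning numeric values, some of which may or may not be called.
The cleanest and most Pythonic approach seems to be a calc() method that returns a vector, i.e. a NumPy array (or a Pandas Series or DataFrame). Client code can then simply do vector subtraction: ab_diff = a.calc() - b.calc(). Given what you have described, there seems to be no need to reinvent the wheel on top of np.array.
If some calculations are rarely called and/or expensive to compute, you could refactor into calc() and calc_rare(). Alternatively, you could pass kwargs, as in calc(..., compute_latlong=False, compute_expensive_stuff=False). For the expensive values that are not computed by default, you can return np.nan as a placeholder so that the vector length stays constant.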
import numpy as np
#import pandas as pd

class MyCalc(object):
    def __init__(self, ...): ...

    # (You can either have 15 calculation methods, or use properties.
    # It depends on whether any of these quantities are interrelated
    # or have shared dependencies, especially expensive ones.)
    def calc_q(self): ...
    def calc_r(self): ...
    def calc_s(self): ...
    ...
    def calc_y(self): ...
    def calc_z(self): ...

    # One main calc() method for the client. (You might hide the
    # other calc_* methods as _calc_*, or else in properties.)
    def calc(self):
        return np.array([ self.calc_q(), self.calc_r(), self.calc_s(),
                          ... self.calc_y(), self.calc_z() ])  # Refactor this as you see fit

if __name__ == '__main__':
    # Client is as simple as this
    a = MyCalc(...)
    b = MyCalc(...)
    ab_diff = a.calc() - b.calc()
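The outline above leaves the details elided. As a runnable illustration of the same idea, here is a minimal sketch with made-up calculations. The inputs `u` and `v`, the three `_calc_*` methods and their formulas are all assumptions chosen only to show the NaN placeholder and the vector subtraction.

```python
import numpy as np


class MyCalc(object):
    # Hypothetical inputs u and v stand in for the real constructor arguments.
    def __init__(self, u, v):
        self.u = u
        self.v = v

    def _calc_q(self):
        return self.u + self.v

    def _calc_r(self):
        return self.u * self.v

    def _calc_expensive(self):
        # Placeholder for a costly calculation.
        return float(self.u) ** self.v

    def calc(self, compute_expensive_stuff=False):
        # Skipped calculations are reported as NaN so the vector
        # always has the same length and element order.
        expensive = self._calc_expensive() if compute_expensive_stuff else np.nan
        return np.array([self._calc_q(), self._calc_r(), expensive], dtype=float)


a = MyCalc(10, 2)
b = MyCalc(2, 3)

# Cheap values only: the third element of the difference is NaN.
print(a.calc() - b.calc())

# All values computed on both sides.
print(a.calc(compute_expensive_stuff=True) - b.calc(compute_expensive_stuff=True))
```

Because NaN propagates through subtraction, an element that was skipped on either side shows up as NaN in the difference, which makes uncomputed values easy to spot.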