Question

I'm trying to use object attributes in a numexpr expression. The most obvious approach:

import numpy as np
import numexpr as ne

class MyClass:
    def __init__(self):
        self.a = np.zeros(10)

o = MyClass()

o.a

b = ne.evaluate("o.a+1")

results in the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-22-dc90c81859f1> in <module>()
     10 o.a
     11 
---> 12 b = ne.evaluate("o.a+1")

~/.local/lib/python3.5/site-packages/numexpr/necompiler.py in evaluate(ex, local_dict, global_dict, out, order, casting, **kwargs)
    799     expr_key = (ex, tuple(sorted(context.items())))
    800     if expr_key not in _names_cache:
--> 801         _names_cache[expr_key] = getExprNames(ex, context)
    802     names, ex_uses_vml = _names_cache[expr_key]
    803     arguments = getArguments(names, local_dict, global_dict)

~/.local/lib/python3.5/site-packages/numexpr/necompiler.py in getExprNames(text, context)
    706 
    707 def getExprNames(text, context):
--> 708     ex = stringToExpression(text, {}, context)
    709     ast = expressionToAST(ex)
    710     input_order = getInputOrder(ast, None)

~/.local/lib/python3.5/site-packages/numexpr/necompiler.py in stringToExpression(s, types, context)
    296         names.update(expressions.functions)
    297         # now build the expression
--> 298         ex = eval(c, names)
    299         if expressions.isConstant(ex):
    300             ex = expressions.ConstantNode(ex, expressions.getKind(ex))

<expr> in <module>()

AttributeError: 'VariableNode' object has no attribute 'a'

After consulting another question, I arrived at a less-than-satisfactory solution using numexpr's global_dict:

import numpy as np
import numexpr as ne

class MyClass:
    def __init__(self):
        self.a = np.zeros(10)

o = MyClass()

o.a

b = ne.evaluate("a+1", global_dict={'a':o.a})

This is going to get pretty messy once MyClass has a dozen attributes and there are several such calls to ne.evaluate.

Is there a simple, clean way of doing this?

Answer 1

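If your object starts to have a lot of attributes, your main concern is the scalability and maintainability of the evaluate calls. You can automate this part by passing vars(o):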

import numpy as np
import numexpr as ne

class MyClass:
    def __init__(self):
        self.a = np.arange(10000)
        self.b = 2*self.a

o = MyClass()

c = ne.evaluate("a+b", local_dict=vars(o))

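Note that I used local_dict because putting these names into the local namespace may be marginally faster. If there is any chance that the instance attributes clash with local names in the script (which depends largely on how you name your attributes and what the class does), it is probably safer to pass the vars as the global_dict, just as in the question (and for the same reason, as noted in a comment).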

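You will still have to keep track of the correspondence between instance attributes and their names in the numexpr expressions, but the approach above skips most of the work.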
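One way to keep an eye on that correspondence is to check an expression against the instance before evaluating it. The following is only a minimal sketch, not part of numexpr: missing_names and the Point class are hypothetical names made up for illustration, and it assumes the expression parses as ordinary Python syntax. It uses only the standard ast module, so it does not tell you anything about what numexpr itself accepts.

```python
import ast

def missing_names(expr, obj):
    """Return the names used in expr that are not instance attributes of obj.

    Hypothetical helper for illustration; assumes expr is valid Python syntax.
    """
    tree = ast.parse(expr, mode="eval")
    # Names that appear as the callee of a call, e.g. sin in "sin(a)",
    # are assumed to be functions rather than attributes and are skipped.
    called = {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(used - called - set(vars(obj)))

class Point:
    def __init__(self):
        self.a = [1, 2, 3]
        self.b = [4, 5, 6]

p = Point()
missing = missing_names("a + b * sin(c)", p)  # c is not an attribute of p
```

If the returned list is empty, every variable in the expression has a matching entry in vars(obj), so passing vars(obj) to the evaluate call should at least find all the names.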

Answer 2

You can do this with the object's __dict__ attribute. It returns a dictionary whose keys are the attribute names (as strings) and whose values are the attributes' actual values.

For example, the code in your question becomes:

import numpy as np
import numexpr as ne

class MyClass:
    def __init__(self):
        self.a = np.zeros(10)

o = MyClass()

o.a

b = ne.evaluate("a+1", global_dict=o.__dict__) # Notice the .__dict__

However, some objects may not have a __dict__ attribute, so I wrote a small function that does the same thing:

def asdict(obj):
    objDict = {}
    for attr in dir(obj):
        objDict[attr] = getattr(obj, attr)
    return objDict

Note that this function also includes methods and some hidden attributes, such as __module__.
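If the methods and hidden attributes get in the way, a filtered variant is one option. This is a rough sketch rather than a drop-in replacement: public_data_attrs and the Slotted class are hypothetical names, and the filter rests on the assumption that anything callable or starting with an underscore is not meant to appear in an expression.

```python
def public_data_attrs(obj):
    """Collect non-callable attributes whose names do not start with an underscore.

    Hypothetical helper for illustration only.
    """
    result = {}
    for name in dir(obj):
        if name.startswith("_"):
            continue  # skips __module__ and other hidden names
        value = getattr(obj, name)
        if callable(value):
            continue  # skips methods
        result[name] = value
    return result

class Slotted:
    __slots__ = ("a", "b")  # instances of this class have no __dict__

    def __init__(self):
        self.a = [0.0, 1.0]
        self.b = [2.0, 3.0]

    def total(self):
        return sum(self.a) + sum(self.b)

namespace = public_data_attrs(Slotted())
```

The resulting dictionary can then be passed as local_dict or global_dict to ne.evaluate in the same way as o.__dict__ above.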

Using object attributes in a numexpr expression