kwarg-splatting a numpy array

Time: 2016-04-15 04:33:34

Tags: python numpy kwargs structured-array

How can I write a wrapper class that makes this work?

import numpy as np

def foo(a, b):
    print(a)

data = np.empty(20, dtype=[('a', np.float32), ('b', np.float32)])

data = my_magic_ndarray_subclass(data)

foo(**data[0])

More background:

I have a pair of functions like these that I would like to vectorize:

def start_the_work(some_arg):
    some_calculation = ...
    something_else = ...

    cost = some_calculation * something_else

    return cost, dict(
        some_calculation=some_calculation,
        some_other_calculation=some_other_calculation
    )

def finish_the_work(some_arg, some_calculation, some_other_calculation):
    ...

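The intent is to call start_the_work with a bunch of different arguments, then finish the work for the lowest-cost one. Both functions use many of the same calculations, so a dictionary and kwarg-splatting are used to pass those results along: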

def run():
    best, best_cost, continuation = min(
        ((some_arg,) + start_the_work(some_arg)
         for some_arg in [1, 2, 3, 4]),
        key=lambda t: t[1]  # cost
    )
    return finish_the_work(best, **continuation)

One way I could vectorize them is as follows:

def start_the_work(some_arg):
    some_calculation = ...
    something_else = ...

    cost = some_calculation * something_else

    continuation = np.empty(cost.shape, dtype=[
        ('some_calculation', np.float32),
        ('some_other_calculation', np.float32)
    ])
    continuation['some_calculation'] = some_calculation
    continuation['some_other_calculation'] = some_other_calculation

    return cost, continuation

However, despite looking like a dictionary, continuation cannot be kwarg-splatted.

3 answers:

Answer 0: (score: 2)

It may not be exactly what you want, but wrapping the array in a pandas DataFrame allows something like this:

import numpy as np
import pandas as pd

def foo(a, b):
    print(a)

data = np.empty(20, dtype=[('a', np.float32), ('b', np.float32)])

data = pd.DataFrame(data).T

foo(**data[0])
# 0.0  (arbitrary: np.empty does not initialize the values)

Note that the DataFrame is transposed, because the primary index in pandas is the column rather than the row.
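As a variation on the above, selecting a row by position with .iloc avoids the transpose. This is only a sketch: the sample values and the df and row names are made up for illustration, and np.zeros is used instead of np.empty so the contents are defined.

```python
import numpy as np
import pandas as pd

def foo(a, b):
    return a, b

# Illustrative data; the field names match the question, the values are made up.
data = np.zeros(3, dtype=[('a', np.float32), ('b', np.float32)])
data['a'] = [1, 2, 3]
data['b'] = [10, 20, 30]

df = pd.DataFrame(data)   # one column per field, one row per array element

row = df.iloc[1]          # a Series indexed by the field names 'a' and 'b'
result = foo(**row)       # a Series has keys() and label lookup, so ** accepts it
```

The row comes back as a pandas Series, so the values passed to foo are NumPy scalars rather than plain Python floats.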

Answer 1: (score: 1)

Are you thinking that, because the fields of a structured array can be accessed by name, they might pass as the items of a dictionary?

In [26]: x=np.ones((3,),dtype='i,f,i')

In [27]: x
Out[27]: 
array([(1, 1.0, 1), (1, 1.0, 1), (1, 1.0, 1)], 
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '<i4')])

In [28]: x['f0']
Out[28]: array([1, 1, 1])

Converting it to a dictionary works:

In [29]: dd={'f0':x['f0'], 'f1':x['f1'], 'f2':x['f2']}

In [30]: def foo(**kwargs):
    ...:     print(kwargs)
    ...:     

In [31]: foo(**dd)
{'f0': array([1, 1, 1]), 'f1': array([ 1.,  1.,  1.], dtype=float32), 'f2': array([1, 1, 1])}

In [32]: foo(**x)  # the array itself won't work
...
TypeError: foo() argument after ** must be a mapping, not numpy.ndarray 

Or with a dictionary comprehension:

In [34]: foo(**{name:x[name] for name in x.dtype.names})
{'f0': array([1, 1, 1]), 'f1': array([ 1.,  1.,  1.], dtype=float32), 'f2': array([1, 1, 1])}

**kwargs probably depends on the object having a .keys() method (plus item lookup by key). An array has no such method.
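To illustrate that point without NumPy, here is a small sketch. FieldBag and NoKeys are hypothetical classes invented for this example; the first provides keys() and __getitem__, the second only __getitem__.

```python
class FieldBag:
    """Hypothetical minimal object that ** accepts: keys() plus __getitem__."""

    def __init__(self, **fields):
        self._fields = fields

    def keys(self):
        return list(self._fields)

    def __getitem__(self, name):
        return self._fields[name]


class NoKeys:
    """Hypothetical object with item lookup but no keys() method."""

    def __getitem__(self, name):
        return 0


def foo(a, b):
    return a + b


result = foo(**FieldBag(a=1, b=2))   # works: keys() names the arguments

try:
    foo(**NoKeys())
    rejected = False
except TypeError:
    rejected = True                  # ** refuses an object without keys()
```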

The elements of a structured array are np.void:

In [163]: a=np.array([(1,2),(3,4)],dtype='i,i')

In [164]: a[0]
Out[164]: (1, 2)

In [165]: type(a[0])
Out[165]: numpy.void

It has a dtype and names:

In [166]: a[0].dtype.names
Out[166]: ('f0', 'f1')

In [167]: [{k:b[k] for k in b.dtype.names} for b in a]
Out[167]: [{'f0': 1, 'f1': 2}, {'f0': 3, 'f1': 4}]
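The same comprehension can be wrapped in a small helper and applied to a single element before splatting. The sketch below assumes a hypothetical helper name, record_to_dict, and made-up sample values; picking the lowest-cost element with np.argmin mirrors the question's use case, but it is not taken from the original code.

```python
import numpy as np

def record_to_dict(record):
    """Hypothetical helper: turn one structured-array element into a plain dict."""
    return {name: record[name] for name in record.dtype.names}

def foo(a, b):
    return a * b

# Illustrative data; np.zeros keeps the contents defined.
data = np.zeros(4, dtype=[('a', np.float32), ('b', np.float32)])
data['a'] = [1, 2, 3, 4]
data['b'] = [10, 20, 30, 40]

# Splat a single element by converting it first.
result = foo(**record_to_dict(data[2]))

# Pick the lowest-cost element, then splat only that one.
cost = data['a'] * data['b']
best = int(np.argmin(cost))
best_kwargs = record_to_dict(data[best])
```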

With an array subclass like yours, a view has this keys method:

class spArray(np.ndarray):
    def keys(self):
        return self.dtype.names

In [171]: asp=a.view(spArray)

In [172]: asp
Out[172]: 
spArray([(1, 2), (3, 4)], 
      dtype=[('f0', '<i4'), ('f1', '<i4')])

In [173]: asp.keys()
Out[173]: ('f0', 'f1')

Other ways of constructing this class (i.e. calling it directly) do not work; that is part of the complexity of subclassing ndarray.

def foo(**kwargs):
    print(kwargs)

In [175]: foo(**asp)
{'f0': spArray([1, 3]), 'f1': spArray([2, 4])}

In [176]: foo(**asp[0])
 ...
TypeError: foo() argument after ** must be a mapping, not numpy.void 

In [177]: foo(**asp[[0]])
{'f0': spArray([1]), 'f1': spArray([2])}

Splatting the array, or a 1-element array extracted from it, works, but splatting a single element (an np.void in this case) does not. It has no keys method.

I tried subclassing np.void the same way as the array; it accepts the definition. But I could not find a way to create such an object.
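One possible way around this, offered only as a sketch, is to leave np.void alone and have the array subclass wrap single elements in a small mapping when they are indexed. RecordMapping and this version of SplattableArray are hypothetical names, and the __getitem__ override has not been checked against repr or other ndarray machinery.

```python
from collections.abc import Mapping

import numpy as np


class RecordMapping(Mapping):
    """Hypothetical read-only mapping over one structured-array element."""

    def __init__(self, record):
        self._record = record

    def __getitem__(self, key):
        if key not in self._record.dtype.names:
            raise KeyError(key)
        return self._record[key]

    def __iter__(self):
        return iter(self._record.dtype.names)

    def __len__(self):
        return len(self._record.dtype.names)


class SplattableArray(np.ndarray):
    """Hypothetical ndarray subclass whose scalar elements can also be splatted."""

    def keys(self):
        return self.dtype.names

    def __getitem__(self, index):
        result = super().__getitem__(index)
        if isinstance(result, np.void):
            # A single structured element: hand back a mapping instead.
            return RecordMapping(result)
        return result


def foo(a, b):
    return a * b


data = np.zeros(3, dtype=[('a', np.float32), ('b', np.float32)])
data['a'] = [1, 2, 3]
data['b'] = [10, 20, 30]
arr = data.view(SplattableArray)

whole = foo(**arr)      # fields passed as arrays
single = foo(**arr[1])  # fields passed as scalars
```

The trade-off is that arr[1] is no longer an np.void, so code that expects a NumPy scalar from indexing would need the underlying array instead.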

Answer 2: (score: 0)

This almost works:

import numpy as np

class SplattableArray(np.ndarray):
    def keys(self):
        return self.dtype.names

data = np.empty(20, dtype=[('a', np.float32), ('b', np.float32)])
data_splat = data.view(SplattableArray)

def foo(a, b):
    return a*b

foo(**data_splat)  # works!
foo(**data_splat[0])  # doesn't work :(

If we are willing to be terrible people, then this works:

from forbiddenfruit import curse
import numpy as np

def keys(obj):
    return obj.dtype.names

curse(np.void, 'keys', keys)
curse(np.ndarray, 'keys', keys)

data = np.empty(10, dtype='i,i')
def foo(**kwargs):
    return kwargs

foo(**data[0])
foo(**data)