How can I apply a function along a column in one go?

Time: 2012-06-28 07:55:58

Tags: python numpy

I have a numpy data structure like this:

[[['diaad'],
  ['iaadf'],
  ['aadfe'],
  ['hedbb'],
  ['edbbb'],
  ['dbbbb']],

 [['gegec'],
  ['ehecf'],
  ['gecfc'],
  ['gadff'],
  ['adfef'],
  ['dffgc']],

 [['ddddj'],
  ['dddjd'],
  ['ddjdd'],
  ['jfffd'],
  ['fgfdb'],
  ['ggdbb']]]

It is instantiated as follows:

>>> a = np.array([[['diaad'], ['iaadf'],  ['aadfe'],  ['hedbb'],  ['edbbb'],  ['dbbbb']], [['gegec'],  ['ehecf'],  ['gecfc'],  ['gadff'],  ['adfef'],  ['dffgc']], [['ddddj'],  ['dddjd'],  ['ddjdd'],  ['jfffd'],  ['fgfdb'],  ['ggdbb']]])

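Is there a direct numpy way to compute a custom function over pairs of elements?

For example, my custom function is called processPair(a, b). It should compute the result for every pair of elements along a column, i.e. for ('diaad', 'gegec'), ('gegec', 'ddddj') and ('diaad', 'ddddj'). Any suggestions on how to do this? I think the map function could do it, but I'm not entirely sure how.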

1 answer:

Answer 0: (score: 1)

Here is my solution. I'm not entirely happy with it, since I feel it should be possible to do this more elegantly, but it does work:

from itertools import combinations

import numpy as np

def apply_pairwise(func, a):
    "For each row, call func with every possible combination of two values"

    stack = []
    for col_a, col_b in combinations(range(a.shape[0]), 2):
        stack.append(np.hstack([a[col_a], a[col_b]]))

    combined = np.vstack(stack)

    def unpack_row(row):
        "Calls func with the values of a given numpy array as arguments"
        return func(*row.tolist())

    return np.apply_along_axis(unpack_row, 1, combined)

Use it like this (assuming the example array a has already been defined):

>>> f = lambda x, y: x + y
>>> print(apply_pairwise(f, a))
['diaadgegec' 'iaadfehecf' 'aadfegecfc' 'hedbbgadff' 'edbbbadfef'
'dbbbbdffgc' 'diaadddddj' 'iaadfdddjd' 'aadfeddjdd' 'hedbbjfffd'
'edbbbfgfdb' 'dbbbbggdbb' 'gegecddddj' 'ehecfdddjd' 'gecfcddjdd'
'gadffjfffd' 'adfeffgfdb' 'dffgcggdbb']
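
For comparison, the same pairing can probably be written without stacking intermediate arrays. The following is only a minimal sketch, not the answerer's method: `apply_pairwise_simple` is a hypothetical name, and it assumes the input has shape (n, m, 1) like the example array, so that each block along the first axis can be flattened to m values.

```python
from itertools import combinations

import numpy as np


def apply_pairwise_simple(func, a):
    """Call func on matching elements of every pair of blocks along axis 0."""
    results = []
    # combinations() yields index pairs (0, 1), (0, 2), (1, 2), ...
    for i, j in combinations(range(a.shape[0]), 2):
        # ravel() flattens each (m, 1) block into m values
        for x, y in zip(a[i].ravel(), a[j].ravel()):
            results.append(func(x, y))
    return np.array(results)


# Small illustrative array (assumed data, same layout as the question's array)
small = np.array([[['ab'], ['cd']], [['ef'], ['gh']], [['ij'], ['kl']]])
print(apply_pairwise_simple(lambda x, y: x + y, small))
```

Because the results are collected in a plain list before being converted, the output dtype is worked out from all the values rather than from the first one only, which may matter if func returns strings of different lengths.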