I have a numpy data structure that looks like this:
[[['diaad'],
['iaadf'],
['aadfe'],
['hedbb'],
['edbbb'],
['dbbbb']],
[['gegec'],
['ehecf'],
['gecfc'],
['gadff'],
['adfef'],
['dffgc']],
[['ddddj'],
['dddjd'],
['ddjdd'],
['jfffd'],
['fgfdb'],
['ggdbb']]]
It is instantiated as follows:
>>> a = np.array([[['diaad'], ['iaadf'], ['aadfe'], ['hedbb'], ['edbbb'], ['dbbbb']], [['gegec'], ['ehecf'], ['gecfc'], ['gadff'], ['adfef'], ['dffgc']], [['ddddj'], ['dddjd'], ['ddjdd'], ['jfffd'], ['fgfdb'], ['ggdbb']]])
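Is there a direct numpy way to compute a custom function over pairwise elements? For example, my custom function is called processPair(a,b). It should compute the result for all pairwise elements along the columns, i.e. for ('diaad', 'gegec'), ('gegec', 'ddddj') and ('diaad', 'ddddj'). Any suggestions on how to do this? I think the map function could achieve it, but I'm not entirely sure how.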
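To make the intended pairing concrete, here is a minimal sketch that lists the pairs for the first row only. It assumes the pairs are unordered combinations of the three blocks along the first axis; the names first_row and pairs are illustrative and do not come from the question.

```python
from itertools import combinations

import numpy as np

a = np.array([[['diaad'], ['iaadf'], ['aadfe'], ['hedbb'], ['edbbb'], ['dbbbb']],
              [['gegec'], ['ehecf'], ['gecfc'], ['gadff'], ['adfef'], ['dffgc']],
              [['ddddj'], ['dddjd'], ['ddjdd'], ['jfffd'], ['fgfdb'], ['ggdbb']]])

# First row of each of the three blocks, as plain Python strings
first_row = a[:, 0, 0].tolist()

# Every unordered pair of those values; processPair would be called once per pair
pairs = list(combinations(first_row, 2))
print(pairs)
```

The same pairing would then be repeated for each of the remaining five rows.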
Answer 0 (score: 1)
Here is my solution. I'm not entirely happy with it, since I feel it should be possible to do this more elegantly, but it does work:
from itertools import combinations

import numpy as np

def apply_pairwise(func, a):
    "For each row, call func with every possible combination of two values"
    stack = []
    # Pair up every two blocks along the first axis: (0, 1), (0, 2), (1, 2), ...
    for col_a, col_b in combinations(range(a.shape[0]), 2):
        # Put the two blocks side by side, giving one pair of values per row
        stack.append(np.hstack([a[col_a], a[col_b]]))
    combined = np.vstack(stack)

    def unpack_row(row):
        "Calls func with the values of a given numpy array as arguments"
        return func(*row.tolist())

    return np.apply_along_axis(unpack_row, 1, combined)
Use it like this (assuming the example array a has already been defined):
>>> f = lambda x, y: x + y
>>> print(apply_pairwise(f, a))
['diaadgegec' 'iaadfehecf' 'aadfegecfc' 'hedbbgadff' 'edbbbadfef'
'dbbbbdffgc' 'diaadddddj' 'iaadfdddjd' 'aadfeddjdd' 'hedbbjfffd'
'edbbbfgfdb' 'dbbbbggdbb' 'gegecddddj' 'ehecfdddjd' 'gecfcddjdd'
'gadffjfffd' 'adfeffgfdb' 'dffgcggdbb']
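As one possible variation on the same idea, the explicit stacking could be replaced by index arrays plus np.frompyfunc. This is only a sketch: the helper name apply_pairwise_ufunc is hypothetical, np.triu_indices is used here to generate the same (0, 1), (0, 2), (1, 2) pairs that combinations yields, and the result comes back with dtype=object rather than a string dtype. It still calls the Python function once per pair, so it should not be expected to be faster.

```python
import numpy as np

def apply_pairwise_ufunc(func, a):
    """Hypothetical helper: call func on every pair of blocks along axis 0."""
    # Index pairs (i, j) with i < j, e.g. (0, 1), (0, 2), (1, 2) for three blocks
    first, second = np.triu_indices(a.shape[0], k=1)
    # frompyfunc wraps a Python callable so it is applied elementwise;
    # the output array has dtype=object
    pairwise = np.frompyfunc(func, 2, 1)
    # Fancy indexing selects the left and right block of each pair
    return pairwise(a[first], a[second]).ravel()

a = np.array([[['diaad'], ['iaadf'], ['aadfe'], ['hedbb'], ['edbbb'], ['dbbbb']],
              [['gegec'], ['ehecf'], ['gecfc'], ['gadff'], ['adfef'], ['dffgc']],
              [['ddddj'], ['dddjd'], ['ddjdd'], ['jfffd'], ['fgfdb'], ['ggdbb']]])

result = apply_pairwise_ufunc(lambda x, y: x + y, a)
print(result[:3])
```

The flattened result follows the same ordering as the output above: all rows for the first pair of blocks, then the second pair, then the third.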