Question

I have a numpy array with only a few non-zero entries, which can be either positive or negative. For example, something like this:

myArray = np.array([[ 0.        ,  0.        ,  0.        ],
       [ 0.32, -6.79,  0.        ],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  1.5        ,  0.        ],
       [ 0.        ,  0.        , -1.71]])

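In the end, I would like to get a list in which each entry corresponds to one row of myArray and is the cumulative product of function outputs that depend on the entries of that row of myArray and on the matching entries of another list (called l in the example below). The individual terms depend on the sign of the myArray entry: when it is positive I apply "funPos", when it is negative I apply "funNeg", and if the entry is 0 the term is 1. So for the example array above, it would be: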

output = [1*1*1 , 
         funPos(0.32, l[0])*funNeg(-6.79,l[1])*1, 
         1*1*1, 
         1*funPos(1.5, l[1])*1, 
         1*1*funNeg(-1.71, l[2])]

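I implemented this as shown below, and it gives me the desired output (note: this is just a highly simplified toy example; the actual matrices are much bigger and the functions more complicated). I loop over each row of the array. If the sum of the row is 0, no calculation is needed and the output is just 1. If it is not 0, I go through the row, check the sign of each value and apply the appropriate function.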

import numpy as np
def doCalcOnArray(Array1, myList):

    output = np.ones(Array1.shape[0]) #initialize output

    for indRow,row in enumerate(Array1):

        if sum(row) != 0: #only then calculations are needed
            tempProd = 1. #initialize the product that corresponds to the row
            for indCol, valCol in enumerate(row):

                if valCol > 0:
                    tempVal = funPos(valCol, myList[indCol])

                elif valCol < 0:
                    tempVal = funNeg(valCol, myList[indCol])

                elif valCol == 0:
                    tempVal = 1

                tempProd = tempProd*tempVal

            output[indRow] = tempProd

    return output

def funPos(val1,val2):
    return val1*val2

def funNeg(val1,val2):
    return val1*(val2+1)

myArray = np.array([[ 0.        ,  0.        ,  0.        ],
       [ 0.32, -6.79,  0.        ],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  1.5        ,  0.        ],
       [ 0.        ,  0.        , -1.71]])     

l = [1.1, 2., 3.4]

op = doCalcOnArray(myArray,l)
print(op)

Output

[ 1.      -7.17024  1.       3.      -7.524  ]

This is the desired result. My question is whether there is a more efficient way to do this, since it is quite "expensive" for large arrays.

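EDIT: I accepted gabhijit's answer because the pure numpy solution he came up with seems to be the fastest one for the arrays I am dealing with. Please note that there is also a nice working solution from RaJa that requires pandas, and that dave's solution works well too and can serve as a good example of how to use generators and numpy's "apply_along_axis".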

Answer 1

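So, let's see if I understand your question.

1. You want to map the elements of your matrix to a new matrix such that:
   - 0 maps to 1
   - x>0 maps to funPos(x)
   - x<0 maps to funNeg(x)
2. You want to compute the product of all elements in each row of this new matrix.

So, this is how I would go about it:

For 1: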

def myFun(a):
    if a==0:
        return 1.0  # float, so np.vectorize does not infer an integer output type from the first element
    if a>0:
        return funPos(a)
    if a<0:
        return funNeg(a)

newFun = np.vectorize(myFun)
newArray = newFun(myArray)

For 2:

np.prod(newArray, axis = 1)

EDIT: To pass the index to funPos and funNeg, you can do something like this:

# Python 2.7
r,c = myArray.shape
ctr = -1       # starts at -1, not 0: without otypes, np.vectorize makes one extra call on the first element to infer the output type
def myFun(a):
    global ctr
    global c
    ind = ctr % c
    ctr += 1
    if a==0:
        return 1.0  # float, so np.vectorize does not infer an integer output type from the first element
    if a>0:
        return funPos(a,l[ind])
    if a<0:
        return funNeg(a,l[ind])
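
As a possible alternative to the global counter, here is a minimal sketch that relies on np.vectorize broadcasting its arguments, so each entry is paired with the matching element of l automatically. It assumes the two-argument funPos and funNeg from the question; passing otypes=[float] is my addition, so that the output type is not inferred from the first element.

```python
import numpy as np

def funPos(val1, val2):
    return val1 * val2

def funNeg(val1, val2):
    return val1 * (val2 + 1)

def myFun(a, b):
    # a is an entry of the array, b is the matching entry of l
    if a > 0:
        return funPos(a, b)
    if a < 0:
        return funNeg(a, b)
    return 1.0

myArray = np.array([[0., 0., 0.],
                    [0.32, -6.79, 0.],
                    [0., 0., 0.],
                    [0., 1.5, 0.],
                    [0., 0., -1.71]])
l = [1.1, 2., 3.4]

# otypes fixes the output dtype, so no extra call is needed to infer it
newFun = np.vectorize(myFun, otypes=[float])

# l (length 3) is broadcast against the (5, 3) array, column by column
newArray = newFun(myArray, l)
result = np.prod(newArray, axis=1)
print(result)
```

Note that np.vectorize is still a Python-level loop, so this is about convenience rather than speed.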

Answer 2

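I tried to use the masking functionality of numpy arrays. However, I could not find a way to replace the masked values in the array with the results of funPos or funNeg.

So my suggestion is to try pandas, because it keeps the indices when masking.

See my example: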

import numpy as np
import pandas as pd

def funPos(a, b):
    return a * b
def funNeg(a, b):
    return a * (b + 1)

myPosFunc = np.vectorize(funPos) #vectorized form of funPos
myNegFunc = np.vectorize(funNeg) #vectorized form of funNeg

#Input
I = [1.0, 2.0, 3.0]    
x = pd.DataFrame([
    [ 0.,0.,0.],
    [ 0.32, -6.79,  0.],
    [ 0.,0.,0.],
    [ 0.,1.5,0.],
    [ 0.,0., -1.71]])

b = pd.DataFrame(myPosFunc(x[x>0], I)) #calculate all positive values
c = pd.DataFrame(myNegFunc(x[x<0], I)) #calculate all negative values   
b = b.mul(c, fill_value=1.) #put values of c in b (combineMult is no longer available in newer pandas versions)
b = b.fillna(1) #replace all missing values that were '0' in the raw array
y = b.product(axis=1) #multiply all elements in one row (the default axis would multiply down each column)

#Output
print ('final result')
print (y)
print (y.tolist())
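
For comparison, the masking idea mentioned at the start of this answer can also be sketched in plain numpy. This is only a sketch under two assumptions: funPos and funNeg accept whole arrays (true for the toy versions here), and the helper name row_products is made up for illustration. The column index of every masked entry comes from np.nonzero, which returns indices in the same row-major order that boolean indexing uses.

```python
import numpy as np

def funPos(a, b):
    return a * b

def funNeg(a, b):
    return a * (b + 1)

def row_products(arr, weights):
    # hypothetical helper, not part of the original answer
    arr = np.asarray(arr, dtype=float)
    weights = np.asarray(weights, dtype=float)
    out = np.ones_like(arr)  # zeros in arr keep the neutral factor 1
    for mask, fun in ((arr > 0, funPos), (arr < 0, funNeg)):
        cols = np.nonzero(mask)[1]  # column index of each selected entry
        out[mask] = fun(arr[mask], weights[cols])
    return out.prod(axis=1)

I = [1.0, 2.0, 3.0]
x = np.array([[0., 0., 0.],
              [0.32, -6.79, 0.],
              [0., 0., 0.],
              [0., 1.5, 0.],
              [0., 0., -1.71]])
y = row_products(x, I)
print(y.tolist())
```

Only the non-zero entries are passed to the functions, which may matter when the real functions are expensive.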

Answer 3

I think this numpy function will help you:

numpy.apply_along_axis

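Here is one implementation. Also, I would warn against checking whether the sum of the row is 0. Comparing floats to 0 can give unexpected behavior because of machine precision limits. In addition, if a row contains -5 and 5, the sum is zero, and I am not sure that is what you want. I used numpy's any() function to check whether a row has any non-zero value. For simplicity, I also pulled your list (my_list) into the global scope.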

import numpy as np


my_list = 1.1, 2., 3.4

def func_pos(val1, val2):
    return val1 * val2

def func_neg(val1, val2):
    return val1 *(val2 + 1)


def my_generator(row):
    for i, a in enumerate(row):
        if a > 0:
            yield func_pos(a, my_list[i])
        elif a < 0:
            yield func_neg(a, my_list[i])
        else:
            yield 1


def reduce_row(row):
    if not row.any():
        return 1.0
    else:
        return np.prod(np.fromiter(my_generator(row), dtype=float))


def main():
    myArray = np.array([
            [ 0.        ,  0.        ,  0.        ],
            [ 0.32, -6.79,  0.        ],
            [ 0.        ,  0.        ,  0.        ],
            [ 0.        ,  1.5        ,  0.        ],
            [ 0.        ,  0.        , -1.71]])
    return np.apply_along_axis(reduce_row, axis=1, arr=myArray)

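There may be faster implementations; I think apply_along_axis is really just a loop under the hood.

I have not tested it, but I bet this is faster than what you started with, and it should be more efficient.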
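
The warning above about testing the row sum can be illustrated with a small example. The row below is made up for illustration: its entries cancel, so the sum test would skip a row that actually needs the calculation, while any() reports that it contains non-zero values.

```python
import numpy as np

# illustrative row: non-zero entries that cancel each other
row = np.array([-5.0, 5.0, 0.0])

needs_calc_by_sum = row.sum() != 0  # the check used in the question
needs_calc_by_any = row.any()       # the check used in this answer

print(needs_calc_by_sum)  # False: the row would be skipped
print(needs_calc_by_any)  # True: the row has non-zero entries
```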

Answer 4

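Here is what I tried, using reduce and map. I am not sure how fast this is, but is this what you are trying to do?

EDIT 4: Simplest and most readable: make l a numpy array, which greatly simplifies the where calls.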

import numpy as np
import time

l = np.array([1.0, 2.0, 3.0])

def posFunc(x,y):
    return x*y

def negFunc(x,y):
    return x*(y+1)

def myFunc(x, y):
    if x > 0:
        return posFunc(x, y)
    if x < 0:
        return negFunc(x, y)
    else:
        return 1.0

myArray = np.array([
        [ 0.,0.,0.],
        [ 0.32, -6.79,  0.],
        [ 0.,0.,0.],
        [ 0.,1.5,0.],
        [ 0.,0., -1.71]])

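from functools import reduce  # built in on Python 2, needs this import on Python 3

t1 = time.time()
# the lambda unpacks (index, value) by hand; tuple parameters are a syntax error on Python 3
a = np.array([reduce(lambda acc, iv: acc*myFunc(iv[1], l[iv[0]]), enumerate(x), 1) for x in myArray])
t2 = time.time()
print((t2-t1)*1000000)
print(a)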

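Basically, just look at the last line: it says to cumulatively multiply the things in enumerate(x), starting with 1 (the last parameter to reduce). myFunc simply takes the element of myArray (the row) and the element of l at the same index, and multiplies them as needed.

My output is not the same as yours, so I am not sure whether this is exactly what you want, but you can probably follow the logic.

Also, I am not so sure how fast this will be for huge arrays.

EDIT: Here is a 'pure numpy' way of doing this.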

my = myArray # just for brevity

t1 = time.time() 
# First set the positive and negative values
# complicated - [my.itemset((x,y), posFunc(my.item(x,y), l[y])) for (x,y) in zip(*np.where(my > 0))]
# changed to 
my = np.where(my > 0, my*l, my)
# complicated - [my.itemset((x,y), negFunc(my.item(x,y), l[y])) for (x,y) in zip(*np.where(my < 0))]
# changed to 
my = np.where(my < 0, my*(l+1), my)
# print my - commented out to time it.

# Now set the zeroes to 1.0s
my = np.where(my == 0.0, 1.0, my)
# print my  - commented out to time it

a = np.prod(my, axis=1)
t2 = time.time()
print((t2-t1)*1000000)

print(a)

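Let me try to explain the zip(*np.where(my != 0)) part. np.where simply returns two numpy arrays: the first holds the row indices and the second holds the column indices of the elements that match the condition, (my != 0) in this case. We take tuples of those indices and then use array.itemset and array.item. Thankfully the column index comes for free, so we can just take the element at that index in the list l. This should be faster than the previous version (and orders of magnitude more readable!!). Need to timeit to find out whether it indeed is.

EDIT 2: There is no need to make separate calls for the positive and negative values; it can be done with a single call to np.where(my != 0).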
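
One thing to keep in mind with the sequential np.where calls above: the second call tests the sign of the already transformed values, so a positive entry whose posFunc result is negative (for example when an element of l is negative) would be transformed a second time. A minimal sketch that avoids this by building every mask from the original array is shown below. The helper name row_products is hypothetical, and it assumes posFunc and negFunc work on whole arrays.

```python
import numpy as np

def posFunc(x, y):
    return x * y

def negFunc(x, y):
    return x * (y + 1)

def row_products(arr, l):
    # hypothetical helper: all masks come from the untouched input
    arr = np.asarray(arr, dtype=float)
    l = np.asarray(l, dtype=float)
    terms = np.where(arr > 0, posFunc(arr, l),
                     np.where(arr < 0, negFunc(arr, l), 1.0))
    return terms.prod(axis=1)

l = np.array([1.0, 2.0, 3.0])
myArray = np.array([[0., 0., 0.],
                    [0.32, -6.79, 0.],
                    [0., 0., 0.],
                    [0., 1.5, 0.],
                    [0., 0., -1.71]])
a = row_products(myArray, l)
print(a)
```

np.where evaluates both function results for every element before selecting, which is fine for cheap functions like these but may be wasteful if the real functions are expensive.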

Applying multiple functions to each row of an array
