How can I parallelize this using Python multiprocessing?

Time: 2018-05-19 16:59:40

Tags: python parallel-processing large-data multiclass-classification

I wrote a one-vs-rest classifier in Python that trains 11 separate classifiers, one per class. The code looks like this:

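import numpy as np

def onevsrest(X_train, y_train, lamb):
    beta = []
    beta_init = np.zeros(X_train.shape[1])
    for i in range(1, 12):  # class labels 1 through 11
        print(i)
        # Relabel a copy of the targets: class i becomes +1, all others -1
        y = np.copy(y_train)
        y[y != i] = -1
        y[y == i] = 1
        # svm is my own training function (not shown here)
        beta_temp, objs = svm(lamb, 0.1, 200, X_train, y)
        beta.append(beta_temp[-1])  # keep the last entry of beta_temp
    return beta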

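How can I parallelize the program above with Python's multiprocessing module? As I understand it, multiprocessing can only be used with functions that take a single argument. How do I extend it to a function like this one, which takes multiple arguments?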

2 answers:

Answer 0 (score: 1)

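You can make a "partial" of your function with functools.partial, freezing the extra arguments so that only one remains. For example: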

from functools import partial
from multiprocessing import Pool

# function that takes multiple arguments
def calc(a, b, c):
    return a + b + c

# build a single-argument partial function by freezing `b` and `c`
calc2 = partial(calc, b=3, c=7)

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(calc2, [1, 2, 3, 4, 5, 6]))  # prints [11, 12, 13, 14, 15, 16]
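
Applied to the question's loop, the same idea might look like the sketch below. It moves the body of the loop into a worker that takes the class label as its first argument and freezes the rest with partial. The names train_one_class and onevsrest_parallel are made up for illustration, and the svm function here is only a hypothetical stand-in (plain subgradient descent on a regularized hinge loss), since the asker's own svm is not shown.

```python
from functools import partial
from multiprocessing import Pool

import numpy as np


def svm(lamb, step, n_iter, X, y):
    """Hypothetical stand-in for the asker's svm(); NOT the original code.

    Subgradient descent on an L2-regularized hinge loss. Returns the list of
    iterates and the list of objective values, mimicking (beta_temp, objs).
    """
    beta = np.zeros(X.shape[1])
    betas, objs = [], []
    for _ in range(n_iter):
        active = y * (X @ beta) < 1  # samples that violate the margin
        grad = lamb * beta - X[active].T @ y[active] / len(y)
        beta = beta - step * grad
        hinge = np.maximum(0, 1 - y * (X @ beta)).mean()
        betas.append(beta)
        objs.append(0.5 * lamb * beta @ beta + hinge)
    return betas, objs


def train_one_class(i, X_train, y_train, lamb):
    """Train class i against the rest; the body of the original loop."""
    y = np.where(y_train == i, 1, -1)
    beta_temp, _ = svm(lamb, 0.1, 200, X_train, y)
    return beta_temp[-1]


def onevsrest_parallel(X_train, y_train, lamb, classes=range(1, 12), processes=4):
    """Run one training job per class label in a pool of worker processes."""
    # Freeze everything except the class label, leaving a one-argument callable.
    worker = partial(train_one_class, X_train=X_train, y_train=y_train, lamb=lamb)
    with Pool(processes) as pool:
        return pool.map(worker, classes)  # results come back in class order


if __name__ == "__main__":
    # Synthetic data purely for demonstration: 11 classes, 20 samples each.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(220, 5))
    y = np.repeat(np.arange(1, 12), 20)
    betas = onevsrest_parallel(X, y, lamb=0.1)
    print(len(betas), betas[0].shape)
```

Note that the frozen arguments are pickled and sent to the worker processes, so with a very large X_train this copying can eat into the speedup.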

Answer 1 (score: 0)

You can also pack the arguments into tuples, for example:

from multiprocessing import Pool

def f(data):
    x, y = data
    return x*y

if __name__ == '__main__':
    with Pool(5) as p:
        X = (1,2,3)
        Y = (1,4,9)
        print(p.map(f, list(zip(X, Y))))  # prints [1, 8, 27]
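
A closely related option, for Python 3.3 or later, is Pool.starmap, which unpacks each tuple into positional arguments so the worker does not have to. The following is a minimal sketch of the same multiplication example; the names multiply and parallel_products are illustrative only.

```python
from multiprocessing import Pool


def multiply(x, y):
    # An ordinary two-argument function; no manual tuple unpacking needed.
    return x * y


def parallel_products(xs, ys, processes=2):
    """Multiply the paired elements of xs and ys in a pool of workers."""
    with Pool(processes) as pool:
        # starmap calls multiply(*args) for each tuple produced by zip.
        return pool.starmap(multiply, zip(xs, ys))


if __name__ == "__main__":
    print(parallel_products((1, 2, 3), (1, 4, 9)))  # prints [1, 8, 27]
```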