I'm using a function that takes too long to run, because it receives a large input and uses two nested for loops.
The function's code:
def transform(self, X):
    global brands
    result = []
    for x in X:
        index = 0
        count = 0
        for brand in brands:
            all_matches = re.findall(re.escape(brand), x, flags=re.I)
            count_all_match = len(all_matches)
            if count_all_match > count:
                count = count_all_match
                index = brands.index(brand)
        result.append([index])
    return np.array(result)
So how can I change this function's code so that it uses multiprocessing to reduce the run time?
Answer 0 (score: 0)
I don't see self being used anywhere in the transform method, so I turned it into a plain module-level function.
import re
import numpy as np
from concurrent.futures import ProcessPoolExecutor

# brands must be defined at module level so the worker processes can see it.

def transformer(x):
    # Find the brand that occurs most often in x (case-insensitive).
    index = 0
    count = 0
    for i, brand in enumerate(brands):
        all_matches = re.findall(re.escape(brand), x, flags=re.I)
        count_all_match = len(all_matches)
        if count_all_match > count:
            count = count_all_match
            index = i
    return [index]

def transform(X):
    # Process the items of X in a pool of worker processes.
    with ProcessPoolExecutor() as executor:
        result = executor.map(transformer, X)
    return np.array(list(result))
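
If X is large and each string is short, the cost of sending items to the workers one at a time can eat most of the gain. The following is only a sketch of one possible variation, not a measured result: it compiles each pattern once and passes chunksize to executor.map so items travel in batches. The brand list, the names best_brand_index and parallel_indices, and the default chunksize of 256 are assumptions made up for illustration.

```python
import re
from concurrent.futures import ProcessPoolExecutor

# Hypothetical brand list; defined at module level so worker processes
# can see it when they load this module.
brands = ["acme", "globex", "initech"]

# Compile every pattern once instead of on each call.
_patterns = [re.compile(re.escape(b), flags=re.I) for b in brands]


def best_brand_index(x):
    """Return [i], where i is the index of the brand matched most often in x.

    Falls back to [0] when no brand matches, like the original loop.
    """
    index, count = 0, 0
    for i, pattern in enumerate(_patterns):
        n = len(pattern.findall(x))
        if n > count:
            index, count = i, n
    return [index]


def parallel_indices(X, chunksize=256):
    """Apply best_brand_index to every item of X in a pool of processes.

    chunksize batches the items sent to each worker; 256 is an arbitrary
    starting point, not a tuned value.
    """
    with ProcessPoolExecutor() as executor:
        return list(executor.map(best_brand_index, X, chunksize=chunksize))
```

Wrapping the result as np.array(parallel_indices(X)) gives the same shape as transform above. On platforms that start workers with spawn (Windows, macOS), the call should sit under an if __name__ == "__main__": guard. It is worth timing this against the plain loop on real data, since process start-up can outweigh the saving for small inputs.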