我有以下代码,需要多次删除。目前,它花费的时间太长。有没有一种有效的方式来编写这两个for循环。
ErrorEst=[]
for i in range(len(embedingFea)):#17000
temp=[]
for j in range(len(emedingEnt)):#15000
if cooccurrenceCount[i][j]>0:
#print(coaccuranceCount[i][j]/ count_max)
weighting_factor = np.min(
[1.0,
math.pow(np.float32(cooccurrenceCount[i][j]/ count_max), scaling_factor)])
embedding_product = (np.multiply(emedingEnt[j], embedingFea[i]), 1)
#tf.log(tf.to_float(self.__cooccurrence_count))
log_cooccurrences =np.log (np.float32(cooccurrenceCount[i][j]))
distance_expr = np.square(([
embedding_product+
focal_bias[i],
context_bias[j],
-(log_cooccurrences)]))
single_losses =(weighting_factor* distance_expr)
temp.append(single_losses)
ErrorEst.append(np.sum(temp))
答案 0 :(得分:0)
如果需要提高代码的性能,则应使用C之类的低级语言编写,并尽量避免使用浮点数。
可能的解决方案:Can we use C code in Python?
答案 1 :(得分:0)
您可以尝试使用numba,并使用@jit
装饰器包装代码。通常,第一次执行需要编译一些内容,因此不会看到很多加速,但是后续迭代会更快。
您可能需要将循环放入一个函数中才能起作用。
from numba import jit
@jit(nopython=True)
def my_double_loop(some, arguments):
for i in range(len(embedingFea)):#17000
temp=[]
for j in range(len(emedingEnt)):#15000
# ...
答案 2 :(得分:0)
首先,请确保尽可能避免列出列表,并像在C语言中那样使用显式循环编写简单易读的代码。所有输入和输出都是numpy数组或标量。
您的代码
import numpy as np
import numba as nb
import math
def your_func(embedingFea,emedingEnt,cooccurrenceCount,count_max,scaling_factor,focal_bias,context_bias):
ErrorEst=[]
for i in range(len(embedingFea)):#17000
temp=[]
for j in range(len(emedingEnt)):#15000
if cooccurrenceCount[i][j]>0:
weighting_factor = np.min([1.0,math.pow(np.float32(cooccurrenceCount[i][j]/ count_max), scaling_factor)])
embedding_product = (np.multiply(emedingEnt[j], embedingFea[i]), 1)
log_cooccurrences =np.log (np.float32(cooccurrenceCount[i][j]))
distance_expr = np.square(([embedding_product+focal_bias[i],context_bias[j],-(log_cooccurrences)]))
single_losses =(weighting_factor* distance_expr)
temp.append(single_losses)
ErrorEst.append(np.sum(temp))
return ErrorEst
Numba代码
@nb.njit(fastmath=True,error_model="numpy",parallel=True)
def your_func_2(embedingFea,emedingEnt,cooccurrenceCount,count_max,scaling_factor,focal_bias,context_bias):
ErrorEst=np.empty((embedingFea.shape[0],2))
for i in nb.prange(embedingFea.shape[0]):
temp_1=0.
temp_2=0.
for j in range(emedingEnt.shape[0]):
if cooccurrenceCount[i,j]>0:
weighting_factor=(cooccurrenceCount[i,j]/ count_max)**scaling_factor
if weighting_factor>1.:
weighting_factor=1.
embedding_product = emedingEnt[j]*embedingFea[i]
log_cooccurrences =np.log(cooccurrenceCount[i,j])
temp_1+=weighting_factor*(embedding_product+focal_bias[i])**2
temp_1+=weighting_factor*(context_bias[j])**2
temp_1+=weighting_factor*(log_cooccurrences)**2
temp_2+=weighting_factor*(1.+focal_bias[i])**2
temp_2+=weighting_factor*(context_bias[j])**2
temp_2+=weighting_factor*(log_cooccurrences)**2
ErrorEst[i,0]=temp_1
ErrorEst[i,1]=temp_2
return ErrorEst
时间
embedingFea=np.random.rand(1700)+1
emedingEnt=np.random.rand(1500)+1
cooccurrenceCount=np.random.rand(1700,1500)+1
focal_bias=np.random.rand(1700)
context_bias=np.random.rand(1500)
count_max=100
scaling_factor=2.5
%timeit res_1=your_func(embedingFea,emedingEnt,cooccurrenceCount,count_max,scaling_factor,focal_bias,context_bias)
1min 1s ± 346 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit res_2=your_func_2(embedingFea,emedingEnt,cooccurrenceCount,count_max,scaling_factor,focal_bias,context_bias)
17.6 ms ± 2.81 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)