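I want to call a method, getRecommendations, which just pickles the recommendations for a specific user to a file. I used code from a book, and it works. But I see that only one core is doing the work, and I would like all of my cores to share it, because that would be much faster.
Here is the method.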
import pickle

def getRecommendations(prefs, person, similarity=sim_pearson):
    print("working on recommendation")
    totals = {}
    simSums = {}
    for other in prefs:
        # don't compare me to myself
        if other == person:
            continue
        sim = similarity(prefs, person, other)
        # ignore scores of zero or lower
        if sim <= 0:
            continue
        for item in prefs[other]:
            # only score movies I haven't seen yet
            if item not in prefs[person] or prefs[person][item] == 0:
                # Similarity * Score
                totals.setdefault(item, 0)
                totals[item] += prefs[other][item] * sim
                # Sum of similarities
                simSums.setdefault(item, 0)
                simSums[item] += sim
    # Create the normalized list
    rankings = [(total / simSums[item], item) for item, total in totals.items()]
    # Return the sorted list, highest score first
    rankings.sort(reverse=True)
    # Save the result; the with block closes the file after writing
    with open("data/rankings/" + str(int(person)) + ".ranking.recommendations", "wb") as ranking_output:
        pickle.dump(rankings, ranking_output)
    return rankings
It is called via:
for i in customerID:
    print("working on %d" % int(i))
    # Make this work with multiple CPUs
    getRecommendations(pickle.load(open("data/critics.recommendations", "rb")), int(i))
As you can see, I try to generate recommendations for every customer. They will be used later.
So how can I multiprocess this method? I can't work it out from reading a few examples, or even the documentation.
Answer 0: (score: 0)
You want something (rough, untested) like:
from multiprocessing import Pool
NUMBER_OF_PROCS = 5 # some number... not necessarily the number of cores due to I/O
pool = Pool(NUMBER_OF_PROCS)
for i in customerID:
    pool.apply_async(getRecommendations, [i])
pool.close()
pool.join()
(This assumes you pass only 'i' to getRecommendations, since pickle.load should only be executed once.)
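One way to load the data only once per worker is a Pool initializer that stores it in a module-level global. The following is a minimal sketch, not tested against the question's data: SAMPLE_PREFS, init_worker and recommend_for are hypothetical names, and recommend_for is only a stand-in for a real call such as getRecommendations(_prefs, customer_id).

```python
from multiprocessing import Pool

# Hypothetical stand-in data; in the question this would be the result of
# unpickling "data/critics.recommendations" once in the parent process.
SAMPLE_PREFS = {
    1: {"a": 4.0, "b": 3.0},
    2: {"a": 5.0, "c": 2.0},
    3: {"b": 1.0, "c": 4.5},
}

_prefs = None  # set once in each worker process by init_worker


def init_worker(prefs):
    """Runs once per worker: keep the preferences in a module global."""
    global _prefs
    _prefs = prefs


def recommend_for(customer_id):
    """Worker task taking only the customer id.

    Stand-in for getRecommendations(_prefs, customer_id): it simply lists
    the items this customer has not rated yet.
    """
    seen = set(_prefs[customer_id])
    all_items = {item for ratings in _prefs.values() for item in ratings}
    return customer_id, sorted(all_items - seen)


if __name__ == "__main__":
    pool = Pool(processes=2, initializer=init_worker, initargs=(SAMPLE_PREFS,))
    try:
        # map blocks until every customer has been processed
        results = dict(pool.map(recommend_for, sorted(SAMPLE_PREFS)))
    finally:
        pool.close()
        pool.join()
    print(results)
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module; without it, each worker would try to create its own pool.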
Answer 1: (score: 0)
The answer James gave is correct. I would just add that you need to import the multiprocessing module via
from multiprocessing import Pool
Also, Pool(4) means you create 4 "worker" processes, which run in parallel to carry out your tasks.
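If you would rather not hard-code the 4, multiprocessing.cpu_count() reports how many CPUs the machine has. Here is a rough sketch under that assumption; square and run_parallel are hypothetical names, with square standing in for the real CPU-bound call. It also keeps the objects returned by apply_async and calls .get() on them, because an exception raised inside a worker is only re-raised in the parent when .get() is called.

```python
from multiprocessing import Pool, cpu_count


def square(n):
    """Hypothetical CPU-bound task standing in for getRecommendations."""
    return n * n


def run_parallel(values, workers=None):
    """Submit one task per value and wait for every result, in order."""
    workers = workers or cpu_count()  # default: one worker per CPU
    pool = Pool(workers)
    try:
        pending = [pool.apply_async(square, [v]) for v in values]
        # .get() blocks until the task is done and re-raises worker errors
        return [p.get() for p in pending]
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(run_parallel(range(8), workers=4))
```

Since each task in the question also writes a file, the best worker count may differ from the number of cores, so it is worth timing a few values.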