最近我读了Toby Segaran写的 Collective Intelligence 。但我被困在了解书中的一些代码。
这是在suggest.py
中的一些代码以下代码是从偏好词典中返回人物的最佳匹配,并使用加权平均值获得某人的推荐 每个其他用户的排名
# Return the Pearson correlation coefficient for p1 and p2
def sim_person(prefs, p1, p2):
# Get the list of shared_items
si={}
for item in prefs[p1]:
if item in prefs[p2]:si[item]=1
# Find the number of elements
n=len(si)
# if they have no ratings in common, return 0
if n==0: return 0
# Add up all the preferences
sum1 = sum([prefs[p1][it] for it in si])
sum2 = sum([prefs[p2][it] for it in si])
# Sum up the squares
sum1Sq = sum([pow(prefs[p1][it],2) for it in si])
sum2Sq = sum([pow(prefs[p2][it],2) for it in si])
# Sum up the products
pSum = sum([prefs[p1][it]*prefs[p2][it] for it in si])
# Calculate Person score
num = pSum - (sum1*sum2/n)
den = sqrt((sum1Sq - pow(sum1,2)/n)*(sum2Sq - pow(sum2,2)/n))
if den == 0: return 0
r = num/den
return r
# Returns the best matches for person from the prefs dictionary.
# Number of results and similarity function are optional params.
def topMatch(prefs, person, n=5, similarity=sim_person):
scores = [(similarity(prefs, person, other), other)
for other in prefs if other!=person]
# Sort the list so the highest scores appear at the top
scores.sort()
scores.reverse()
return scores[0:n]
# Gets recommendations for a person by using a weighted average
# of every other user's rankings
def getRecommendations(prefs, person, similarity=sim_person):
totals = {}
simSums = {}
for other in prefs:
# don't compare me to myself
if other == person: continue
sim = similarity(prefs, person, other)
# ignore scores of zero of lower
if sim<=0: continue
for item in prefs[other]:
# only score movies I haven't seen yet
if item not in prefs[person] or prefs[person][item]==0:
# Similarity * Score
totals.setdefault(item, 0)
totals[item]+=prefs[other][item]*sim
# Sum of similarities
simSums.setdefault(item, 0)
simSums[item]+=sim
# Create the normalized list
rankings = [(total/simSums[item], item) for item, total in totals.items()]
# Return the sorted list
rankings.sort()
rankings.reverse()
return rankings
我无法理解的第一个代码是:
scores = [(similarity(prefs, person, other), other) for other in prefs if other!=person]
这句话中的第二个是否意味着一个参数?我可以将此代码更改为:
scores = [(similarity(prefs, person, other) for other in prefs if other!=person]
我无法理解的第二个代码是:
rankings = [(total/simSums[item], item) for item, total in totals.items()]
答案 0 :(得分:1)
看起来你正在构建元组。比较:
coordinates = (10, 2)
与
some_score = (similarity(prefs, person, other), other)
您正在创建一个2元素元组。第一个元素是similarity(prefs, person, other)
,第二个元素是other
。