Question

I have recently been reading Collective Intelligence by Toby Segaran, but I am stuck trying to understand some of the code in the book.

Here is some of the code from suggest.py:

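The code below returns the best matches for a person from the prefs dictionary, and gets recommendations for a person by using a weighted average of every other user's rankings.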

from math import sqrt

# Return the Pearson correlation coefficient for p1 and p2
def sim_person(prefs, p1, p2):
    # Get the list of shared_items
    si={}
    for item in prefs[p1]:
        if item in prefs[p2]:si[item]=1

    # Find the number of elements 
    n=len(si)

    # if they have no ratings in common, return 0
    if n==0: return 0

    # Add up all the preferences
    sum1 = sum([prefs[p1][it] for it in si])
    sum2 = sum([prefs[p2][it] for it in si])

    # Sum up the squares
    sum1Sq = sum([pow(prefs[p1][it],2) for it in si])
    sum2Sq = sum([pow(prefs[p2][it],2) for it in si])

    # Sum up the products
    pSum = sum([prefs[p1][it]*prefs[p2][it] for it in si])

    # Calculate Pearson score
    num = pSum - (sum1*sum2/n)
    den = sqrt((sum1Sq - pow(sum1,2)/n)*(sum2Sq - pow(sum2,2)/n))
    if den == 0: return 0

    r = num/den
    return r

# Returns the best matches for person from the prefs dictionary.
# Number of results and similarity function are optional params.
def topMatch(prefs, person, n=5, similarity=sim_person):
    scores = [(similarity(prefs, person, other), other) 
                        for other in prefs if other!=person]

    # Sort the list so the highest scores appear at the top
    scores.sort()
    scores.reverse()
    return scores[0:n]

# Gets recommendations for a person by using a weighted average
# of every other user's rankings 
def getRecommendations(prefs, person, similarity=sim_person):
    totals = {}
    simSums = {}
    for other in prefs:
        # don't compare me to myself
        if other == person: continue
        sim = similarity(prefs, person, other)

        # ignore scores of zero or lower
        if sim<=0: continue
        for item in prefs[other]:

            # only score movies I haven't seen yet
            if item not in prefs[person] or prefs[person][item]==0:
                # Similarity * Score
                totals.setdefault(item, 0)
                totals[item]+=prefs[other][item]*sim
                # Sum of similarities
                simSums.setdefault(item, 0)
                simSums[item]+=sim

    # Create the normalized list 
    rankings = [(total/simSums[item], item) for item, total in totals.items()]

    # Return the sorted list 
    rankings.sort()
    rankings.reverse()
    return rankings

The first piece of code I can't understand is:

scores = [(similarity(prefs, person, other), other) for other in prefs if other!=person]

Is the second other in this statement a parameter? Can I change this code to:

scores = [similarity(prefs, person, other) for other in prefs if other!=person]

The second piece of code I can't understand is:

rankings = [(total/simSums[item], item) for item, total in totals.items()]

Answer 1

It looks like you are building tuples. Compare:

coordinates = (10, 2)

with

some_score = (similarity(prefs, person, other), other)

You are creating a 2-element tuple. The first element is similarity(prefs, person, other), and the second element is other.
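To make this concrete, here is a minimal sketch with made-up numbers (the names, scores, and the sims dictionary are hypothetical, not taken from the book). It suggests why the second element is worth keeping: if you drop other, the comprehension still runs, but you are left with bare scores and no way to tell which person each one belongs to. The rankings line follows the same pattern, where for item, total in totals.items() unpacks each key/value pair of the dictionary.

```python
# Hypothetical similarity scores, invented for illustration only.
sims = {"Lisa": 0.99, "Gene": 0.38, "Toby": -0.5}

# Each element is a (score, name) tuple, like the ones built in topMatch.
scores = [(sim, name) for name, sim in sims.items()]

# Tuples compare by their first element first, so this orders by score.
scores.sort(reverse=True)
best = scores[0:2]

# Without the second element you only get the numbers, not the names.
bare = sorted(sims.values(), reverse=True)

# Same idea as the rankings line: items() yields (key, value) pairs,
# which "for item, total in ..." unpacks into two variables.
totals = {"Night": 6.0, "Lady": 3.0}     # hypothetical weighted totals
simSums = {"Night": 2.0, "Lady": 1.5}    # hypothetical similarity sums
rankings = [(total / simSums[item], item) for item, total in totals.items()]
rankings.sort(reverse=True)

print(best)
print(bare)
print(rankings)
```

Putting the score first in each tuple is what lets a plain sort() order the list by score, which is presumably why the book builds (score, name) rather than (name, score).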
