I want to compute the similarity between users, but I can't add values to an empty dictionary.
Here is my code:
import math
import pandas as pd

data = {'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
                      'Just My Luck': 3.0, 'Superman Returns': 3.5, 'You, Me and Dupree': 2.5},
        'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
                         'Just My Luck': 1.5, 'The Night Listener': 3.0},
        'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
                             'Superman Returns': 3.5, 'The Night Listener': 4.0},
        'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
                         'The Night Listener': 4.5, 'You, Me and Dupree': 2.5},
        'Mick LaSalle': {'Just My Luck': 2.0, 'Lady in the Water': 3.0, 'Superman Returns': 3.0,
                         'The Night Listener': 3.0, 'You, Me and Dupree': 2.0},
        'Jack Matthews': {'Snakes on a Plane': 4.0, 'The Night Listener': 3.0,
                          'Superman Returns': 5.0, 'You, Me and Dupree': 3.5},
        'Toby': {'Snakes on a Plane': 4.5, 'You, Me and Dupree': 1.0, 'Superman Returns': 4.0}}
df = pd.DataFrame(data)
def usersimilarity(df):
    w = dict()
    for u in df.keys():
        for v in df.keys():
            if u == v:
                continue
            w[u][v] = len(set(df[u])&set(df[v]))/math.sqrt(len(df[u])*len(df[v])*1.0)
    return w
Answer 0 (score: 1)
Is this what you are trying to do?
def usersimilarity(df):
    w = dict()
    for u in df.keys():
        for v in df.keys():
            if u == v:
                continue
            # Create the inner dict for u before assigning w[u][v].
            if u not in w:
                w[u] = dict()
            w[u][v] = len(set(df[u])&set(df[v]))/math.sqrt(len(df[u])*len(df[v])*1.0)
    return w
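When w[u][v] = ... runs, Python first looks up w[u] and only then assigns to [v] on the result. Since w starts out empty, w[u] does not exist yet and the lookup raises a KeyError. You have to create w[u] first.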
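As an alternative to the explicit membership check, here is a minimal sketch using collections.defaultdict, which creates the inner dict the first time w[u] is accessed. It assumes the input is the plain nested dict ({user: {movie: rating}}) rather than the DataFrame, because iterating over a DataFrame column yields the rating values, not the movie titles, so the set intersection would not count shared movies. The name user_similarity and the trimmed sample data are illustrative only, and the sketch assumes every user has rated at least one movie.

```python
import math
from collections import defaultdict


def user_similarity(ratings):
    """Overlap of rated items between users, normalised by sqrt(|A| * |B|).

    `ratings` is assumed to be a plain dict: {user: {item: score}}.
    """
    # defaultdict(dict) builds w[u] = {} automatically on first access,
    # so w[u][v] = ... no longer raises KeyError.
    w = defaultdict(dict)
    for u, items_u in ratings.items():
        for v, items_v in ratings.items():
            if u == v:
                continue
            # set() over a dict gives its keys, i.e. the item names.
            shared = len(set(items_u) & set(items_v))
            w[u][v] = shared / math.sqrt(len(items_u) * len(items_v))
    return dict(w)


# Trimmed, hypothetical sample (not the full data set from the question).
sample = {
    'Toby': {'Snakes on a Plane': 4.5, 'Superman Returns': 4.0},
    'Jack Matthews': {'Snakes on a Plane': 4.0, 'The Night Listener': 3.0},
}
sim = user_similarity(sample)
print(sim['Toby']['Jack Matthews'])  # 0.5 (one shared movie, sqrt(2 * 2) = 2)
```

The same effect can be had without defaultdict by writing w.setdefault(u, {})[v] = ... inside the loop; which one to use is mostly a matter of taste.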