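Here is the algorithm I use for my recommender system.
To get the users that current_user follows: current_user.followings
To get current_user's followers: current_user.followers
The algorithm is described in This Paper, page 5.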
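To recommend another user from R, I score each element of R with the following formula:
score(person) = (occurrences(person)/R.count) * (followers(person)/followees(person) * retweets(person)/tweets(person).count)
The closer the score is to 1, the more likely the user is to be interested in that person.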
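As a worked illustration of the formula above, here is a minimal Ruby sketch. The method name `score` and its keyword arguments are hypothetical, and the numbers are made up; it only shows the arithmetic. It uses `to_f` because Ruby's integer division would otherwise truncate a ratio such as 3/12 to 0.

```ruby
# Hypothetical helper: the method name and all keyword arguments are
# assumptions for illustration, not part of any existing model.
def score(occurrences:, r_count:, followers:, followees:, retweets:, tweets:)
  # Guard against division by zero; such a person simply scores 0.
  return 0.0 if r_count.zero? || followees.zero? || tweets.zero?

  (occurrences.to_f / r_count) *
    (followers.to_f / followees) *
    (retweets.to_f / tweets)
end

# Made-up numbers: the person appears 3 times in an R of 12 entries,
# has 200 followers, follows 100 people, and 5 of their 50 tweets were retweeted.
example = score(occurrences: 3, r_count: 12,
                followers: 200, followees: 100,
                retweets: 5, tweets: 50)
puts example
```

Note that the followers/followees ratio can be larger than 1, so the raw score is not bounded by 1 unless the terms are normalized.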
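I'm having trouble with the first part of the algorithm: counting how many times a person occurs in R (the occurrences term of the formula). I have the following code: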
    def candidates(user)
      @following = user.following # the people the user follows (S)
      @follower = [] # empty array to hold L
      @following.each do |follow|
        @follower = @follower + follow.followers # populate L
      end
      @followees = [] # array to hold T
      @follower.each do |ff|
        @followees = @followees + ff.following # populate T
      end
      @followees = @followees - @following # drop the people the user already follows: T - S gives R
      @rezultat = []
      @sugested = @followees.uniq # remove the duplicates
      @sugested.each do |gg| # for each user that he might want to follow
        nr = 0
        @followees.each do |ff|
          nr = nr + 1 if ff.email == gg.email # count how often this user appears in R
        end
        if gg.following.count != 0
          # to_f avoids integer division, which would truncate both ratios
          score = (nr.to_f / @followees.count) * (gg.followers.count.to_f / gg.following.count) # score without retweets yet
        else
          score = 0
        end
      end
    end
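Now I have to compute the score. The problem I'm facing is counting the occurrences of the same object in R. The objects are User model objects, which contain the following fields:
Here is how I thought of counting them, though I'm not sure it works (and I don't like it): walk the whole array with the email of the person whose occurrences I want to count, and add 1 to a counter each time that email comes up (emails are unique). Any other ideas?
Also, once the scores are computed, how should I keep the person-to-score relation so that I can easily sort by score and then display the person objects? :D
Any hints or code are appreciated!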
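For the counting step described above, one option is to build a hash of counts in a single pass instead of rescanning the array for every candidate. This is a minimal sketch that assumes only that each object exposes a unique `email`; `DemoUser` and the sample data are hypothetical stand-ins for the real model.

```ruby
# Hypothetical stand-in for the User model; only the email field is assumed.
DemoUser = Struct.new(:email)

alice = DemoUser.new("alice@example.com")
bob   = DemoUser.new("bob@example.com")

# r plays the role of R with its duplicates kept.
r = [alice, bob, alice, alice]

# Single pass: Hash.new(0) makes every missing key start at 0.
occurrences = r.each_with_object(Hash.new(0)) do |person, counts|
  counts[person.email] += 1
end

puts occurrences.inspect
# On Ruby 2.7 or newer, r.map(&:email).tally builds the same hash.
```

Each lookup such as `occurrences[alice.email]` is then a hash access rather than another scan of the array.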
Answer 0 (score: 1)
A bit more Ruby-like:
    @following = user.following
    # flat_map keeps the result one flat array of users instead of an array of arrays
    @follower = @following.flat_map { |followed| followed.followers }
    @followees = @follower.flat_map { |follower| follower.following }
    @followees = @followees - @following # R = T - S
    @score = {}
    # count how many times each suggested user occurs in R
    @followees.uniq.each { |suggested| @score[suggested] = @followees.count(suggested) }
    @score.select! { |user, count| count > 0 && user.following.count > 0 }
    @score.each do |sguser, count|
      # to_f avoids integer division; the first ratio is occurrences / R.count
      @score[sguser] =
        (count.to_f / @followees.count) *
        (sguser.followers.count.to_f / sguser.following.count)
    end
You end up with a hash @score of the form { suggested_user => score_value }, which you can sort however you need.
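Here is a minimal sketch of that sorting step, assuming a hash shaped like `@score`. Plain strings stand in for the user objects and the scores are invented.

```ruby
# Invented data: strings stand in for user objects, values for their scores.
scores = { "alice" => 0.05, "bob" => 0.4, "carol" => 0.12 }

# sort_by returns an array of [user, score] pairs; negating sorts descending.
ranked = scores.sort_by { |_user, score| -score }

# Keep only the users of the two best-scoring pairs.
top_two = ranked.first(2).map { |user, _score| user }

puts top_two.inspect
```

Because `ranked` keeps each user next to its score, the objects can be displayed in order without a second lookup.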
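If your data is large enough, you could move more of this into SQL (JOIN, GROUP) to reduce the size and number of the arrays. You might even do it entirely in the database without pulling any values into your application (AFAIK that is possible).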