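I need to compute the six-degree relationship count for every user.
Each user may have zero or more friends. The table structure is as follows: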
+----------+---------+------+-----+---------+----------------+
| Field    | Type    | Null | Key | Default | Extra          |
+----------+---------+------+-----+---------+----------------+
| id       | int(11) | NO   | PRI | NULL    | auto_increment |
| userId   | int(11) | NO   | MUL | NULL    |                |
| friendId | int(11) | NO   |     | NULL    |                |
+----------+---------+------+-----+---------+----------------+
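Currently I fetch all relationship records from the database and convert them into a Map[Long, Set[Long]], which maps each user's id to the set of that user's friend ids.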
    val friendMap = friends.groupBy(_.userId) map { group =>
      group._1 -> group._2.map(_.friendId).toSet
    }
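To make the shape of friendMap concrete, here is a minimal sketch on three sample rows. The `Friend` case class and the sample data are assumptions for illustration only; they mirror the table columns but are not taken from the real code.

```scala
// Hypothetical row type mirroring the table columns (an assumption, not the real model class).
final case class Friend(id: Long, userId: Long, friendId: Long)

object FriendMapExample {
  // Made-up sample rows: user 1 has friends 2 and 3, user 2 has friend 1.
  val friends: Seq[Friend] = Seq(
    Friend(1L, 1L, 2L),
    Friend(2L, 1L, 3L),
    Friend(3L, 2L, 1L)
  )

  // Same grouping as above, written with a pattern match instead of _1 / _2.
  val friendMap: Map[Long, Set[Long]] =
    friends.groupBy(_.userId).map { case (userId, rows) =>
      userId -> rows.map(_.friendId).toSet
    }
}
```

User 3 never appears in the userId column here, so it has no key in the map. That is why the lookups further down use getOrElse with an empty set.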
Then I compute the number of friends within six degrees for each user:
    val sixDegreeFriendCountMap = friendMap map { m =>
      val (userId, friendIds) = m
      val twoDegree = friendIds.flatMap(id => friendMap.getOrElse(id, Set())) --
        friendIds
      val threeDegree = twoDegree.flatMap(id => friendMap.getOrElse(id, Set())) --
        friendIds -- twoDegree
      val fourDegree = threeDegree.flatMap(id => friendMap.getOrElse(id, Set())) --
        friendIds -- twoDegree -- threeDegree
      val fiveDegree = fourDegree.flatMap(id => friendMap.getOrElse(id, Set())) --
        friendIds -- twoDegree -- threeDegree -- fourDegree
      val sixDegree = fiveDegree.flatMap(id => friendMap.getOrElse(id, Set())) --
        friendIds -- twoDegree -- threeDegree -- fourDegree -- fiveDegree
      val all = friendIds ++ twoDegree ++ threeDegree ++ fourDegree ++ fiveDegree ++ sixDegree
      userId -> all.size
    }
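For reference, the six steps above amount to a depth-limited breadth-first expansion. Below is a minimal sketch of the same computation as a loop with a single visited set, so that earlier levels are not subtracted again at every step. The names `SixDegrees`, `reachableCount` and `maxDepth` are hypothetical, and whether this is actually faster on the real data has not been measured.

```scala
import scala.collection.mutable

object SixDegrees {
  // Hypothetical helper: counts the ids reachable from `userId` in at most `maxDepth` hops.
  // Like the code above, it does not exclude `userId` itself if some path leads back to it.
  def reachableCount(friendMap: Map[Long, Set[Long]], userId: Long, maxDepth: Int = 6): Int = {
    val visited = mutable.HashSet.empty[Long]
    // Level 1 is the user's direct friends.
    var frontier: Set[Long] = friendMap.getOrElse(userId, Set.empty[Long])
    var depth = 1
    while (depth <= maxDepth && frontier.nonEmpty) {
      visited ++= frontier
      frontier =
        if (depth == maxDepth) Set.empty[Long] // last level reached, nothing more to expand
        else
          frontier
            .flatMap(id => friendMap.getOrElse(id, Set.empty[Long]))
            .filterNot(visited) // keep only ids not seen at an earlier level
      depth += 1
    }
    visited.size
  }
}
```

With this helper, the count map could be built as `friendMap.keys.map(id => id -> SixDegrees.reachableCount(friendMap, id)).toMap`. Each user's count is independent of the others, so the per-user calls could also be run in parallel.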
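I can get the result from sixDegreeFriendCountMap, but the problem is that the computation takes about 500 milliseconds per user, and I have 300,000 users.
So the program runs for more than 40 hours (300,000 × 0.5 s = 150,000 s, about 41.7 hours).
Any ideas on how to speed up the sixDegreeFriendCountMap computation?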