I want to find the common items in the "following_list" column of a users table:
+----+--------------------+-------------------------------------+
| id | name               | following_list                      |
+----+--------------------+-------------------------------------+
|  9 | User 1             | 26,6,12,10,21,24,19,16              |
| 10 | User 2             | 21,24                               |
| 12 | User 3             | 9,20,21,26,30                       |
| 16 | User 4             | 6,52,9,10                           |
| 19 | User 5             | 9,10,6,24                           |
| 21 | User 6             | 9,10,6,12                           |
| 24 | User 7             | 9,10,6                              |
| 46 | User 8             | 45                                  |
| 52 | User 9             | 10,12,16,21,19,20,18,17,23,25,24,22 |
+----+--------------------+-------------------------------------+
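I would like to be able to sort by the number of matches for a given user ID. For example, I want to match every user except #9 against #9 to see which IDs in their "following_list" columns they have in common.
I found a way to do this using the SET data type and some tricks:
http://dev.mysql.com/tech-resources/articles/mysql-set-datatype.html#bits
However, I need to do this against an arbitrary list of IDs. I would like this to be done entirely in the database, but it is a bit out of my league.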
SELECT a.following_id, COUNT( c.following_id ) AS matches
FROM following a
LEFT JOIN following b ON b.user_id = a.following_id
LEFT JOIN following c ON c.user_id = a.user_id
AND c.following_id = b.following_id
WHERE a.user_id = ?
GROUP BY a.following_id
Now I have to convince myself not to optimize prematurely.
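For anyone who wants to try the query above against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a normalized `following (user_id, following_id)` table filled from the sample rows. The `ORDER BY` is an addition that is not in the original query, and the helper names `build_db` and `matches_for` are made up for illustration.

```python
import sqlite3

# Sample data from the question: user id -> comma-separated following_list.
SAMPLE = {
    9: "26,6,12,10,21,24,19,16",
    10: "21,24",
    12: "9,20,21,26,30",
    16: "6,52,9,10",
    19: "9,10,6,24",
    21: "9,10,6,12",
    24: "9,10,6",
    46: "45",
    52: "10,12,16,21,19,20,18,17,23,25,24,22",
}

# The query from the question, with an added ORDER BY (not in the original)
# so that the ranking is deterministic.
QUERY = """
SELECT a.following_id, COUNT(c.following_id) AS matches
FROM following a
LEFT JOIN following b ON b.user_id = a.following_id
LEFT JOIN following c ON c.user_id = a.user_id
                     AND c.following_id = b.following_id
WHERE a.user_id = ?
GROUP BY a.following_id
ORDER BY matches DESC, a.following_id
"""


def build_db():
    """Create an in-memory database with one row per (user, followed user)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE following (user_id INTEGER NOT NULL, following_id INTEGER NOT NULL)"
    )
    rows = [(uid, int(f)) for uid, lst in SAMPLE.items() for f in lst.split(",")]
    conn.executemany("INSERT INTO following VALUES (?, ?)", rows)
    conn.commit()
    return conn


def matches_for(conn, user_id):
    """Return (following_id, matches) pairs for the users that user_id follows."""
    return conn.execute(QUERY, (user_id,)).fetchall()


if __name__ == "__main__":
    for following_id, matches in matches_for(build_db(), 9):
        print(following_id, matches)
```

Note that this query only ranks the users that #9 already follows. IDs such as 6 and 26 have no rows of their own in the sample, so the LEFT JOINs keep them in the result with a count of 0.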
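Answer 0 (score: 2)
If you normalize the following_list column into a separate table with user_id and follower_id columns, you will find COUNT() very easy to use. The logic for selecting a user's list of followers, or the list of users they follow, also becomes easier.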
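A rough sketch of what that could look like, using Python's sqlite3 module so it can be run as-is. The table name `followers` and the reading of a row as "follower_id follows user_id" are assumptions, since the answer does not spell out either. The helper functions are hypothetical.

```python
import sqlite3

# Sample data from the question: user id -> comma-separated following_list.
SAMPLE = {
    9: "26,6,12,10,21,24,19,16",
    10: "21,24",
    12: "9,20,21,26,30",
    16: "6,52,9,10",
    19: "9,10,6,24",
    21: "9,10,6,12",
    24: "9,10,6",
    46: "45",
    52: "10,12,16,21,19,20,18,17,23,25,24,22",
}


def build_db():
    conn = sqlite3.connect(":memory:")
    # Assumed layout: one row per relationship, read as "follower_id follows user_id".
    conn.execute(
        """CREATE TABLE followers (
               user_id     INTEGER NOT NULL,
               follower_id INTEGER NOT NULL,
               PRIMARY KEY (user_id, follower_id)
           )"""
    )
    rows = [
        (int(followed), follower)
        for follower, lst in SAMPLE.items()
        for followed in lst.split(",")
    ]
    conn.executemany("INSERT INTO followers (user_id, follower_id) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def follower_count(conn, user_id):
    """How many users follow user_id."""
    sql = "SELECT COUNT(*) FROM followers WHERE user_id = ?"
    return conn.execute(sql, (user_id,)).fetchone()[0]


def following_count(conn, user_id):
    """How many users user_id follows."""
    sql = "SELECT COUNT(*) FROM followers WHERE follower_id = ?"
    return conn.execute(sql, (user_id,)).fetchone()[0]


def followers_of(conn, user_id):
    """List of users who follow user_id, in ascending id order."""
    sql = "SELECT follower_id FROM followers WHERE user_id = ? ORDER BY follower_id"
    return [row[0] for row in conn.execute(sql, (user_id,))]


if __name__ == "__main__":
    conn = build_db()
    print(follower_count(conn, 10), following_count(conn, 9), followers_of(conn, 10))
```

Each lookup is a single-table query with a plain WHERE clause, which is the main gain over parsing a comma-separated string.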
Answer 1 (score: 2)
If you can split the following_list column out into a child table, it will simplify your problem, e.g.
TABLE id_following_list:
id | following
--------------
10 | 21
10 | 24
46 | 45
...| ...
You can read more here.
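As an illustration of the split itself, here is a small sketch in Python with sqlite3 that copies each comma-separated entry into the id_following_list table shown above. The source table name `users` is an assumption (the question does not name it), and `split_following_list` and `demo_db` are hypothetical helpers.

```python
import sqlite3

# Sample rows from the question: (id, name, following_list).
SAMPLE_USERS = [
    (9, "User 1", "26,6,12,10,21,24,19,16"),
    (10, "User 2", "21,24"),
    (12, "User 3", "9,20,21,26,30"),
    (16, "User 4", "6,52,9,10"),
    (19, "User 5", "9,10,6,24"),
    (21, "User 6", "9,10,6,12"),
    (24, "User 7", "9,10,6"),
    (46, "User 8", "45"),
    (52, "User 9", "10,12,16,21,19,20,18,17,23,25,24,22"),
]


def demo_db():
    """In-memory copy of the original, denormalized table (name assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, following_list TEXT)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", SAMPLE_USERS)
    conn.commit()
    return conn


def split_following_list(conn):
    """Copy every entry of users.following_list into id_following_list.

    Returns the number of (id, following) pairs that were parsed.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS id_following_list (
               id        INTEGER NOT NULL,
               following INTEGER NOT NULL,
               PRIMARY KEY (id, following)
           )"""
    )
    pairs = []
    for user_id, following_list in conn.execute(
        "SELECT id, following_list FROM users"
    ).fetchall():
        for item in (following_list or "").split(","):
            item = item.strip()
            if item:  # skip empty entries such as a trailing comma
                pairs.append((user_id, int(item)))
    # INSERT OR IGNORE keeps the migration safe to re-run.
    conn.executemany(
        "INSERT OR IGNORE INTO id_following_list (id, following) VALUES (?, ?)", pairs
    )
    conn.commit()
    return len(pairs)


if __name__ == "__main__":
    conn = demo_db()
    print(split_following_list(conn), "rows copied")
```

Once the child table is populated and verified, the following_list column can be dropped.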
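Answer 2 (score: 1)
Normalize the table: drop the following_list column and create a table named following with these columns:
user_id
following_id
This leads to easy queries (untested, but you get the idea):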
SELECT b.user_id, COUNT(c.following_id) AS matches
FROM following a
JOIN following b -- followings of each user that <id> follows
  ON b.user_id = a.following_id
JOIN following c -- followings of <id> again, matched against the followings of b
  ON c.following_id = b.following_id
  AND c.user_id = a.user_id
WHERE a.user_id = <id>
GROUP BY b.user_id
ORDER BY matches DESC
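Performance will depend heavily on indexes and the size of the data set. You might add a "similarity" column that is updated periodically or on change, purely for fast data retrieval.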
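To make the indexing and caching idea concrete, here is a hedged sketch in Python with sqlite3. It uses a separate `similarity` table rather than a single column, which is my own variation on the suggestion. The index name and the helpers `refresh_similarity` and `top_matches` are hypothetical. Unlike the query above, the cached counts compare a user against every other user, as the question describes.

```python
import sqlite3

# Sample data from the question: user id -> comma-separated following_list.
SAMPLE = {
    9: "26,6,12,10,21,24,19,16",
    10: "21,24",
    12: "9,20,21,26,30",
    16: "6,52,9,10",
    19: "9,10,6,24",
    21: "9,10,6,12",
    24: "9,10,6",
    46: "45",
    52: "10,12,16,21,19,20,18,17,23,25,24,22",
}


def build_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE following (
            user_id      INTEGER NOT NULL,
            following_id INTEGER NOT NULL,
            PRIMARY KEY (user_id, following_id)
        );
        -- Reverse index so the self-join on following_id can use an index.
        CREATE INDEX idx_following_reverse ON following (following_id, user_id);
        -- Hypothetical cache table standing in for the "similarity" column.
        CREATE TABLE similarity (
            user_id  INTEGER NOT NULL,
            other_id INTEGER NOT NULL,
            matches  INTEGER NOT NULL,
            PRIMARY KEY (user_id, other_id)
        );
        """
    )
    rows = [(uid, int(f)) for uid, lst in SAMPLE.items() for f in lst.split(",")]
    conn.executemany("INSERT INTO following VALUES (?, ?)", rows)
    conn.commit()
    return conn


def refresh_similarity(conn):
    """Rebuild the cache: shared followings for every ordered pair of users."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM similarity")
        conn.execute(
            """
            INSERT INTO similarity (user_id, other_id, matches)
            SELECT a.user_id, b.user_id, COUNT(*)
            FROM following a
            JOIN following b
              ON b.following_id = a.following_id
             AND b.user_id <> a.user_id
            GROUP BY a.user_id, b.user_id
            """
        )


def top_matches(conn, user_id):
    """Read the cached ranking for one user, best match first."""
    sql = """
        SELECT other_id, matches FROM similarity
        WHERE user_id = ?
        ORDER BY matches DESC, other_id
    """
    return conn.execute(sql, (user_id,)).fetchall()


if __name__ == "__main__":
    conn = build_db()
    refresh_similarity(conn)
    print(top_matches(conn, 9))
```

Users with no shared followings, such as #46 in the sample, simply have no rows in the cache. How often to call the refresh is a trade-off between freshness and write load, so it is worth measuring before committing to it.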