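I have a football pool website. Each week, my friends pick the winner of every game. I want to compare each player's picks with the other players' picks and list the similarity as a percentage. I found this page, which helped me calculate the similarity for a specific week: Compare group of tags to find similarity/score with PHP/MySQL. Thanks to Ivar Bonsaksen, his solution works great!
What I want to do now is show each player's cumulative similarity over the past weeks.
I have 3 tables to query: Profiles (spprofiles), Games (sp6games) and Picks (sp6picks). Another table, Teams (sp6teams), holds the team names, but it isn't relevant here.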
Profiles (spprofiles)
+-----------+-------------+
| profileID | profilename |
+-----------+-------------+
| 52 | My Team A |
| 53 | Some Team B |
+-----------+-------------+
Games (sp6games)
+--------+--------+---------+------+
| gameID | weekID | visitor | home |
+--------+--------+---------+------+
| 1 | 2 | 9 | 21 |
| 2 | 2 | 14 | 6 |
| 17 | 3 | 6 | 9 |
| 18 | 3 | 30 | 21 |
+--------+--------+---------+------+
Picks (sp6picks)
+-----------+--------+------+
| profileID | gameID | pick |
+-----------+--------+------+
| 52 | 1 | 21 |
| 52 | 2 | 6 |
| 52 | 17 | 12 |
| 52 | 18 | 21 |
| 53 | 1 | 9 |
| 53 | 2 | 6 |
| 53 | 17 | 9 |
| 53 | 18 | 21 |
+-----------+--------+------+
The query for a single week looks like this:
$weekID = 3; //the current weekID
$profile = 52; //the current ProfileID
SELECT
targetProfiles.profileID AS targetID,
sourceProfiles.profileID AS sourceID,
COUNT(targetProfiles.profileID)
/
(((SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = sourceProfiles.profileID AND weekID = $weekID)
+
(SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = targetProfiles.profileID AND weekID = $weekID))/2)
AS similarity
FROM
spProfiles AS sourceProfiles
LEFT JOIN
(SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID = $weekID) AS sourcePicks
ON (sourcePicks.profileID = sourceProfiles.profileID)
INNER JOIN
(SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID = $weekID) AS targetPicks
ON (sourcePicks.pick = targetPicks.pick AND sourcePicks.profileID != targetPicks.profileID)
LEFT JOIN
spProfiles AS targetProfiles
ON (targetPicks.profileID = targetProfiles.profileID)
WHERE sourceProfiles.profileID = $profile
GROUP BY targetID
If I run this query separately for each week, I get the following results:
$weekID = 2;
+----------+----------+------------+
| targetID | sourceID | similarity |
+----------+----------+------------+
| 53 | 52 | 0.5000 |
+----------+----------+------------+
$weekID = 3;
+----------+----------+------------+
| targetID | sourceID | similarity |
+----------+----------+------------+
| 53 | 52 | 0.5000 |
+----------+----------+------------+
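The cumulative query I have come up with so far looks like the one below (I have tried several other variations as well). Basically, I only changed the WHERE clauses to include the previous weeks, weekID <= $weekID, and added the Games table to the main FROM clause with LEFT JOIN sp6games ON (targetPicks.gameID = sp6games.gameID).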
$weekID = 3; //the current weekID
$profile = 52; //the current ProfileID
SELECT
targetProfiles.profileID AS targetID,
sourceProfiles.profileID AS sourceID,
COUNT(targetProfiles.profileID)
/
(((SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = sourceProfiles.profileID AND weekID <= $weekID)
+
(SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = targetProfiles.profileID AND weekID <= $weekID))/2)
AS similarity
FROM
spProfiles AS sourceProfiles
LEFT JOIN
(SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID <= $weekID) AS sourcePicks
ON (sourcePicks.profileID = sourceProfiles.profileID)
INNER JOIN
(SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID <= $weekID) AS targetPicks
ON (sourcePicks.pick = targetPicks.pick AND sourcePicks.profileID != targetPicks.profileID)
LEFT JOIN
spProfiles AS targetProfiles
ON (targetPicks.profileID = targetProfiles.profileID)
LEFT JOIN sp6games ON (targetPicks.gameID = sp6games.gameID)
WHERE sourceProfiles.profileID = $profile
GROUP BY targetID, weekID
The combined result should be 0.5000, but I get:
$weekID = 3;
+----------+----------+------------+
| targetID | sourceID | similarity |
+----------+----------+------------+
| 53 | 52 | 0.7500 |
+----------+----------+------------+
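The problem is that COUNT(targetProfiles.profileID) is not counted correctly once several weeks are included, so the similarity value comes out wrong. The query also does not seem very efficient for larger data sets.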
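One possible reading of the 0.7500 figure, sketched in Python rather than SQL so it can be run without a database: the join condition compares only pick, not gameID, so once several weeks are in scope a pick in one game can pair up with the same team picked in a different game. The helper count_matches and the constant names are hypothetical, and the sketch models only the join condition and the denominator, not the query's grouping.

```python
# Sample picks from the question, as (profileID, gameID, pick) rows.
PICKS = [
    (52, 1, 21), (52, 2, 6), (52, 17, 12), (52, 18, 21),
    (53, 1, 9), (53, 2, 6), (53, 17, 9), (53, 18, 21),
]


def count_matches(source_id, target_id, same_game):
    """Count (source row, target row) pairs that a join would pair up.

    same_game=False mimics joining on pick alone;
    same_game=True additionally requires the same gameID.
    """
    source = [(g, p) for pid, g, p in PICKS if pid == source_id]
    target = [(g, p) for pid, g, p in PICKS if pid == target_id]
    return sum(
        1
        for sg, sp in source
        for tg, tp in target
        if sp == tp and (sg == tg or not same_game)
    )


GAMES_PER_PLAYER = 4  # each profile has 4 picks across weeks 2 and 3

pick_only = count_matches(52, 53, same_game=False) / GAMES_PER_PLAYER
pick_and_game = count_matches(52, 53, same_game=True) / GAMES_PER_PLAYER
print(pick_only, pick_and_game)  # 0.75 0.5
```

With the sample data, profile 52's pick of team 21 in game 1 pairs with profile 53's pick of team 21 in game 18, which adds a third match on top of the two genuine agreements (games 2 and 18).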
Thanks for taking the time to read this, and for any help you can offer.
Answer 0 (score: 2)
SELECT t.profileID AS target,
SUM(s.pick=t.pick)/COUNT(*) AS similarity
FROM sp6picks s
JOIN sp6picks t USING (gameID)
JOIN sp6games g USING (gameID)
WHERE g.weekID <= 3
AND s.profileID != t.profileID
AND s.profileID = 52
GROUP BY t.profileID
See it on sqlfiddle.
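For anyone who wants to check the query against the sample data locally, here is a minimal sketch using Python's built-in sqlite3 module instead of MySQL. Two adaptations are assumptions of this sketch, not part of the answer: the joins use ON instead of USING, and the sum is multiplied by 1.0 because SQLite would otherwise perform integer division. The function name similarity is hypothetical, and only the columns the query needs are populated.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sp6games (gameID INTEGER, weekID INTEGER, visitor INTEGER, home INTEGER);
    CREATE TABLE sp6picks (profileID INTEGER, gameID INTEGER, pick INTEGER);
    INSERT INTO sp6games VALUES (1, 2, 9, 21), (2, 2, 14, 6), (17, 3, 6, 9), (18, 3, 30, 21);
    INSERT INTO sp6picks VALUES
        (52, 1, 21), (52, 2, 6), (52, 17, 12), (52, 18, 21),
        (53, 1, 9), (53, 2, 6), (53, 17, 9), (53, 18, 21);
""")

# Same logic as the answer's query: pair picks on the same game, then take
# the share of games (up to the given week) where both picks agree.
QUERY = """
    SELECT t.profileID AS target,
           SUM(s.pick = t.pick) * 1.0 / COUNT(*) AS similarity
    FROM sp6picks s
    JOIN sp6picks t ON t.gameID = s.gameID
    JOIN sp6games g ON g.gameID = s.gameID
    WHERE g.weekID <= ?
      AND s.profileID != t.profileID
      AND s.profileID = ?
    GROUP BY t.profileID
"""


def similarity(week_id, profile_id):
    """Return (targetID, similarity) rows for all weeks up to week_id."""
    return conn.execute(QUERY, (week_id, profile_id)).fetchall()


print(similarity(3, 52))  # [(53, 0.5)]
```

The week and profile are passed as bound parameters here rather than interpolated into the SQL string; the same idea should carry over to prepared statements in PHP.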