SQL: get similar "match" results by percentage

Time: 2013-11-06 22:35:12

Tags: mysql left-join union ifnull

This table stores user votes on matches between users. Every row has a winner, a loser, and a voter.

CREATE TABLE `user_versus` (
  `id_user_versus` int(11) NOT NULL AUTO_INCREMENT,
  `id_user_winner` int(10) unsigned NOT NULL,
  `id_user_loser` int(10) unsigned NOT NULL,
  `id_user` int(10) unsigned NOT NULL,
  `date_versus` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id_user_versus`),
  KEY `id_user_winner` (`id_user_winner`,`id_user_loser`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=17 ;

INSERT INTO `user_versus` (`id_user_versus`, `id_user_winner`, `id_user_loser`, `id_user`, `date_versus`) VALUES
(1, 6, 7, 1, '2013-10-25 23:02:57'),
(2, 6, 8, 1, '2013-10-25 23:02:57'),
(3, 6, 9, 1, '2013-10-25 23:03:04'),
(4, 6, 10, 1, '2013-10-25 23:03:04'),
(5, 6, 11, 1, '2013-10-25 23:03:10'),
(6, 6, 12, 1, '2013-10-25 23:03:10'),
(7, 6, 13, 1, '2013-10-25 23:03:18'),
(8, 6, 14, 1, '2013-10-25 23:03:18'),
(9, 7, 6, 2, '2013-10-26 04:02:57'),
(10, 8, 6, 2, '2013-10-26 04:02:57'),
(11, 9, 8, 2, '2013-10-26 04:03:04'),
(12, 9, 10, 2, '2013-10-26 04:03:04'),
(13, 9, 11, 2, '2013-10-26 04:03:10'),
(14, 9, 12, 2, '2013-10-26 04:03:10'),
(15, 9, 13, 2, '2013-10-26 04:03:18'),
(16, 9, 14, 2, '2013-10-26 04:03:18');

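I'm working on a query that fetches similar profiles. A profile counts as similar when its vote percentage (wins vs. losses) is within +/- 10% of the specified profile's percentage.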

SELECT id_user_winner AS id_user,
    IFNULL(wins, 0) AS wins,
    IFNULL(loses, 0) AS loses,
    IFNULL(wins, 0) + IFNULL(loses, 0) AS total,
    IFNULL(wins, 0) / (IFNULL(wins, 0) + IFNULL(loses, 0)) AS percent
FROM
(
    SELECT id_user_winner AS id_user FROM user_versus 
    UNION
    SELECT id_user_loser FROM user_versus 
) AS u
LEFT JOIN
(
    SELECT id_user_winner, COUNT(*) AS wins
    FROM user_versus
    GROUP BY id_user_winner
) AS w
ON u.id_user = id_user_winner
LEFT JOIN
(
    SELECT id_user_loser, COUNT(*) AS loses
    FROM user_versus
    GROUP BY id_user_loser
) AS l
ON u.id_user = l.id_user_loser

This is the current result:

[screenshot: mysql result]

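It is currently returning NULL rows, which shouldn't be there. What still needs to be refined (and I can't quite nail it down) is:

  1. Return only the users that are similar to user ABC
  2. Specify a condition that defines the user to compare against, e.g. user id = 6 (where similar users are within +/- 10% of user id 6's percentage)

Any help would be appreciated. Thanks!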

1 answer:

Answer 0 (score: 1)

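To count each user's wins and losses without joining the table to itself and without OUTER joins, you can select the wins and the losses separately and combine them with UNION ALL, adding to each row a flag that says whether it represents a win or a loss for that user.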
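As a minimal sketch of that step on its own, using only the `user_versus` table from the question, the flagged rows can be summed per player like this:

```sql
-- Each vote produces two rows: one for the winner (1, 0) and one for the loser (0, 1).
-- Summing the flags per player gives that player's win and loss counts.
SELECT
    player_id,
    SUM(is_a_win)  AS wins,
    SUM(is_a_loss) AS losses
FROM (
    SELECT id_user_winner AS player_id, 1 AS is_a_win, 0 AS is_a_loss FROM user_versus
    UNION ALL
    SELECT id_user_loser, 0, 1 FROM user_versus
) AS games
GROUP BY player_id;
```

Because every player contributes at least one row to `games`, a player who never won gets `wins = 0` rather than NULL (with the sample data, user 10 comes out as 0 wins and 2 losses). In the question's query, by contrast, the outer SELECT reads `id_user_winner` from the left-joined `w` subquery, which is NULL for users without a win; that is the likely source of the NULL rows.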

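After that, counting all wins and losses per user is easy. The tricky part is working in a way to specify the user whose profile you want to compare against. I used a variable that is set to the win percentage of the user with the given user_id; you can change that id from a constant to a variable.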

Here is my proposal (comparing against the user with id = 6):

SELECT
    player_id AS id_user,
    wins,
    losses,
    wins + losses AS total,
    wins / (wins + losses) AS percent
  FROM (
    SELECT
        player_id,
        SUM(is_a_win) AS wins,
        SUM(is_a_loss) AS losses,
        -- Side effect: store the reference user's win ratio in a user variable
        CASE
          WHEN player_id = 6
            THEN @the_user_score := SUM(is_a_win) / (SUM(is_a_win) + SUM(is_a_loss))
          ELSE NULL
        END AS reference_score
      FROM (
        SELECT id_user_winner AS player_id, 1 AS is_a_win, 0 AS is_a_loss FROM user_versus
        UNION ALL SELECT id_user_loser, 0, 1 FROM user_versus
      ) games
    GROUP BY player_id
  ) data
WHERE
  ABS(wins / (wins + losses) - @the_user_score) <= 0.1
;

Output:

ID_USER  WINS  LOSSES  TOTAL  PERCENT
6        8     2       10     0.8
9        6     1       7      0.8571
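Of course, you can drop the user whose profile is the basis of the comparison by adding a player_id != 6 condition (or, in the final solution, some variable name instead of 6) to the outermost WHERE clause.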
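A minimal sketch of that exclusion follows. It assumes the reference id is kept in a user variable named `@target_id` (a hypothetical name, not from the original query) that is set with SET beforehand. As a deliberate change from the query above, it reads the reference user's score from a cross-joined derived table instead of assigning `@the_user_score` inside the SELECT, since MySQL documents the evaluation order of expressions involving user variables as undefined and deprecates assigning them outside SET as of MySQL 8.0.

```sql
-- Hypothetical parameter: the user whose profile is the basis of the comparison.
SET @target_id := 6;

SELECT
    s.player_id AS id_user,
    s.wins,
    s.losses,
    s.wins + s.losses AS total,
    s.wins / (s.wins + s.losses) AS percent
FROM (
    -- Wins and losses per player, same technique as the query above.
    SELECT player_id, SUM(is_a_win) AS wins, SUM(is_a_loss) AS losses
    FROM (
        SELECT id_user_winner AS player_id, 1 AS is_a_win, 0 AS is_a_loss FROM user_versus
        UNION ALL
        SELECT id_user_loser, 0, 1 FROM user_versus
    ) AS games
    GROUP BY player_id
) AS s
CROSS JOIN (
    -- Single-row derived table holding the reference user's win ratio.
    -- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the wins.
    SELECT SUM(id_user_winner = @target_id) / COUNT(*) AS score
    FROM user_versus
    WHERE id_user_winner = @target_id OR id_user_loser = @target_id
) AS ref
WHERE s.player_id <> @target_id                                    -- drop the reference user
  AND ABS(s.wins / (s.wins + s.losses) - ref.score) <= 0.1;        -- within 10 percentage points
```

With the sample data from the question this should leave only user 9 (6 wins, 1 loss, about 0.8571 against the reference score of 0.8). If the reference user has no votes at all, `ref.score` is NULL and the query returns no rows.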

Example on SQLFiddle: Matching Profiles - Example

Could you give some feedback on whether this is what you were after and, if not, what output you would expect?