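Counting rows from different tables in a JOIN statement

Time: 2015-06-11 11:46:15

Tags: mysql

I have 2 tables: comments and ratings. The comments table contains a column reply, which indicates whether a comment is a reply to another comment. The ratings table contains the comment ratings in the form comment_id, user_id, rating.

The query I use to select comments for display is somewhat complicated, so I will try to simplify it as much as possible: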

SELECT
COALESCE(SUM(cr.vote), 0) AS rating,
COUNT(r.id) AS replies

FROM comments c 
LEFT JOIN comments_ratings cr ON c.id = cr.comment
LEFT JOIN comments r ON c.id = r.reply

WHERE c.id = 1

GROUP BY c.id;

Here is the test setup:

CREATE TABLE `comments` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `text` text NOT NULL,
  `author` int(10) unsigned NOT NULL,
  `time` datetime DEFAULT NULL,
  `reply` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `reply` (`reply`),
  CONSTRAINT `comments_ibfk_1` FOREIGN KEY (`reply`) REFERENCES `comments` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

CREATE TABLE `comments_ratings` (
  `comment` int(10) unsigned NOT NULL,
  `user` int(10) unsigned NOT NULL,
  `vote` tinyint(4) NOT NULL,
  PRIMARY KEY (`comment`,`user`),
  KEY `user` (`user`),
  CONSTRAINT `comments_ratings_ibfk_1` FOREIGN KEY (`comment`) REFERENCES `comments` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
  -- CONSTRAINT `comments_ratings_ibfk_2` FOREIGN KEY (`user`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO comments (id, reply, text, author) VALUES (1, null, '', 0), (null, 1, '', 0),(null, 1, '', 0),(null, 1, '', 0);
INSERT INTO comments_ratings (comment, user, vote) VALUES (1, 1, 1);

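Now, if you run the SELECT statement, you will see that rating comes out as 3, even though comments_ratings contains only one record, with a value of 1. If I add another reply, it becomes 4. If you then add another comments_ratings record with a value of 1, it doubles and becomes 8. This happens because the join produces one row for every combination of rating and reply, so each row repeats the data from the table it was not matched against.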
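To make the duplication visible, here is a small sketch that runs the same joins without the aggregation. The column aliases are only illustrative. With the test data above it returns three rows, one per reply, and each of them carries the single vote of 1, which is why SUM adds up to 3.

```sql
-- Same joins as the original query, but without SUM/COUNT/GROUP BY,
-- so every joined row is shown individually.
SELECT
    c.id    AS comment_id,
    cr.user AS rating_user,
    cr.vote AS vote,        -- repeated once per matching reply
    r.id    AS reply_id     -- repeated once per matching rating
FROM comments c
LEFT JOIN comments_ratings cr ON c.id = cr.comment
LEFT JOIN comments r ON c.id = r.reply
WHERE c.id = 1;
```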

Could you help me set up the joins so that the rating and the reply count are not multiplied by each other?

3 Answers:

Answer 0: (score: 1)

One approach is to pre-aggregate the data before the join. Like this:

FROM comments c LEFT JOIN
     (SELECT cr.comment, SUM(cr.vote) as vote
      FROM comments_ratings cr
      GROUP BY cr.comment
     ) cr
     ON c.id = cr.comment LEFT JOIN
     comments r
     ON c.id = r.reply

For performance reasons, you may also want to include the filter condition in the subquery.
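As a sketch of how the full query might look with both ideas applied (the SELECT list is my assumption, since only the FROM clause is given above): the ratings are summed inside the subquery, which is filtered to the same comment as the outer query. Because cr.vote is already the per-comment total, the outer query must not SUM it again across the reply rows, so MAX is used here just to pick that single value.

```sql
SELECT
    COALESCE(MAX(cr.vote), 0) AS rating,  -- already a total; MAX only picks it out of the group
    COUNT(r.id)               AS replies
FROM comments c
LEFT JOIN (
    SELECT comment, SUM(vote) AS vote
    FROM comments_ratings
    WHERE comment = 1                     -- same filter as the outer query
    GROUP BY comment
) cr ON c.id = cr.comment
LEFT JOIN comments r ON c.id = r.reply
WHERE c.id = 1
GROUP BY c.id;
```

With the test data from the question this returns rating 1 and replies 3.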

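Answer 1: (score: 1)

When you LEFT JOIN several child tables to one parent table, keep in mind that the parent table's rows are repeated for every combination of matching rows from the child tables, so you should change your query to something like this: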

SELECT
    COALESCE(SUM(cr.vote), 0) AS rating,
    COALESCE(SUM(r.cnt), 0) AS replies
FROM 
    comments c 
LEFT JOIN 
    (SELECT
        cri.comment,
        SUM(cri.vote) As vote
     FROM
        comments_ratings cri
     GROUP BY
        cri.comment
    )cr ON c.id = cr.comment
LEFT JOIN 
    (SELECT  
        ci.reply,
        COUNT(ci.id) cnt
     FROM 
        comments ci
     GROUP BY
        ci.reply
    ) AS r ON c.id = r.reply
WHERE 
    c.id = 1
GROUP BY 
    c.id;
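A quick way to check this against the test setup from the question is to add one more rating and one more reply and then rerun the queries. The user id 2 is arbitrary; it works here because the foreign key to users is commented out in the test schema.

```sql
-- Second rating for comment 1 from a hypothetical user 2
INSERT INTO comments_ratings (comment, user, vote) VALUES (1, 2, 1);
-- Fourth reply to comment 1
INSERT INTO comments (reply, text, author) VALUES (1, '', 0);
```

After these inserts the query above should return rating 2 and replies 4, whereas the original query from the question returns 8 for both, because 2 ratings times 4 replies produce 8 joined rows.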

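Answer 2: (score: 0)

Update: although both answers are correct, I am currently testing this setup with a large amount of data, and the performance is poor. After a brief investigation I identified the cause: the suggested solutions create an in-memory temporary table that is filled with all the data from the table on every query, so the query time grows as the amount of data grows. On the fairly weak server I am running on, I was getting query times of more than 5 seconds with only a few thousand rows.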
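One way to look at this yourself is to prefix the query with EXPLAIN. The sketch below uses only the ratings subquery from the answers above; in the output, rows whose select_type is DERIVED correspond to subqueries in the FROM clause, and the rows estimate for them gives a rough idea of how much of the table is read to build the derived table. The exact plan depends on the MySQL version and the data, so treat it as a diagnostic aid rather than a definitive result.

```sql
EXPLAIN
SELECT
    COALESCE(SUM(cr.vote), 0) AS rating
FROM comments c
LEFT JOIN (
    SELECT cri.comment, SUM(cri.vote) AS vote
    FROM comments_ratings cri
    GROUP BY cri.comment          -- no filter: aggregates every comment's ratings
) cr ON c.id = cr.comment
WHERE c.id = 1
GROUP BY c.id;
```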

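I have found a solution to this problem. It still uses a temporary table, but instead of copying the whole table into it, it copies only the range of records being selected. Here it is: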

SELECT
    c.*,
    COUNT(r.id) AS replies
FROM
    (
        SELECT
            c.id,
            c.text,
            c.time,
            c.author AS author_id,
            SUM(cr.vote) AS rating,
            crv.vote AS voted
        FROM
            comments c
        LEFT JOIN users u ON u.id = c.author
        LEFT JOIN comments_ratings cr ON cr.comment = c.id
        LEFT JOIN comments_ratings crv ON crv.comment = c.id
        AND crv.user = ?
        WHERE
            c.item = ?
        AND c.type = ?
        AND c.id < ?
        GROUP BY
            c.id
        ORDER BY
            c.id DESC
        LIMIT 0, 100
    ) AS c
LEFT JOIN comments r ON c.id = r.reply
GROUP BY
    c.id
ORDER BY
    c.id DESC

I tested this approach on a table with more than 4 million records, and on a fairly weak server the query executed in under 10 milliseconds.