我有4张桌子:
Table talks
table talks_fan
table talks_follow
table talks_comments
我想要实现的目标是为每一次谈话计算所有评论,粉丝和粉丝。
到目前为止我想出了这个。
所有tables
都有talk_id
,且只有talks
表中的主键是
SELECT
g. *,
COUNT( m.talk_id ) AS num_of_comments,
COUNT( f.talk_id ) AS num_of_followers
FROM
talks AS g
LEFT JOIN talks_comments AS m
USING ( talk_id )
LEFT JOIN talks_follow AS f
USING ( talk_id )
WHERE g.privacy = 'public'
GROUP BY g.talk_id
ORDER BY g.created_date DESC
LIMIT 30;
我也尝试过使用这种方法
SELECT
t.*,
COUNT(b.talk_id) AS comments,
COUNT(bt.talk_id) AS followers
FROM
talks t
LEFT JOIN talks_follow bt
ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b
ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
两者都给我相同的结果....?!
更新:创建语句
CREATE TABLE IF NOT EXISTS `talks` (
`talk_id` bigint(20) NOT NULL AUTO_INCREMENT,
`user_id` mediumint(9) NOT NULL,
`title` varchar(255) NOT NULL,
`content` text NOT NULL,
`created_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`privacy` enum('public','private') NOT NULL DEFAULT 'private',
PRIMARY KEY (`talk_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=7 ;
CREATE TABLE IF NOT EXISTS `talks_comments` (
`comment_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` mediumint(9) NOT NULL,
`comment` text NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`comment_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;
CREATE TABLE IF NOT EXISTS `talks_fan` (
`fan_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` bigint(20) NOT NULL,
`created_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`status` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`fan_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=4 ;
CREATE TABLE IF NOT EXISTS `talks_follow` (
`follow_id` bigint(20) NOT NULL AUTO_INCREMENT,
`talk_id` bigint(20) NOT NULL,
`user_id` mediumint(9) NOT NULL,
`date_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`follow_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
有效的最终查询
SELECT t.* , COUNT( DISTINCT b.comment_id ) AS comments,
COUNT( DISTINCT bt.follow_id ) AS followers,
COUNT( DISTINCT c.fan_id ) AS fans
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
LEFT JOIN talks_fan c ON c.talk_id = t.talk_id
WHERE t.privacy = 'public'
GROUP BY t.talk_id
ORDER BY t.created_date DESC
LIMIT 30
编辑:整个问题的最终答案......
我已经修改了Query并在PHP(Codeigniter)中创建了一些代码以解决我的问题,即@Bill Karwin的推荐
$sql="
SELECT t.*,
COUNT( DISTINCT b.comment_id ) AS comments,
COUNT( DISTINCT bt.follow_id ) AS followers,
COUNT( DISTINCT c.fan_id ) AS fans,
GROUP_CONCAT( DISTINCT c.user_id ) AS list_of_fans
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
LEFT JOIN talks_fan c ON c.talk_id = t.talk_id
WHERE t.privacy = 'public'
GROUP BY t.talk_id
ORDER BY t.created_date DESC
LIMIT 30
";
$query = $this->db->query($sql);
if($query->num_rows() > 0)
{
$results = array();
foreach($query->result_array() AS $talk){
$fan_user_id = explode(",", $talk['list_of_fans']);
foreach($fan_user_id AS $user){
if($user == 1 /* this supposed to be user id or session*/){
$talk['list_of_fans'] = 'yes';
}
}
$follower_user_id = explode(",", $talk['list_of_follower']);
foreach($follower_user_id AS $user){
if($user == 1 /* this supposed to be user id or session*/){
$talk['list_of_follower'] = 'yes';
}
}
$results[] = array(
'talk_id' => $talk['talk_id'],
'user_id' => $talk['user_id'],
'title' => $talk['title'],
'created_date' => $talk['created_date'],
'comments' => $talk['comments'],
'followers' => $talk['followers'],
'fans' => $talk['fans'],
'list_of_fans' => $talk['list_of_fans'],
'list_of_follower' => $talk['list_of_follower']
);
}
}
我仍然相信它可以在数据库中得到优化并且只能使用结果......
我在想如果每个TALK有1000个粉丝和2000个粉丝,那么结果将需要更长的时间才能加载..如果你不喜欢10个或者10个错误听到......
编辑:为查询测试添加基准...
我使用了codeigniter profiler来了解查询完成排除所需的时间。
据说我也开始在表格中添加数据
结果如下。
在向其提交数据后测试数据库
Query Results time
table Talks
---------------
table data 50 rows.
Time: 0.0173 seconds
Table Rows: 644 rows
Time: 0.0535 seconds
Table Rows: 1250 rows
Time: 0.0856 seconds
Adding data to other tables
--------------------------
Talks = 1250 rows
talks_follow = 4115
talks_fan = 10 rows
Time: 2.656 seconds
Adding data to other tables
--------------------------
Talks = 1250 rows
talks_follow = 4115
talks_fan = 10 rows
talks_comments = 3650 rows
Time: 10.156 seconds
After replacing LEFT JOIN with STRAIGHT_JOIN
Time: 6.675 seconds
它似乎对DB非常沉重...... 现在我正在进入另一个如何提高其绩效的困境
编辑:使用@leonardo_assumpcao建议
After rebuilding the DB using @leonardo_assumpcao suggestion
for indexing few fields..........
Adding data to other tables
--------------------------
Talks = 6000 Rows
talks_follow = 10000 Rows
talks_fan = 10000 Rows
talks_comments = 10000 Rows
Time: 17.940 second
重数据DB这是正常的吗??
答案 0 :(得分:0)
您可以将此强制转换为一个查询,如下所示:
SELECT COUNT(*) num, 'talks' item FROM talks
UNION
SELECT COUNT(*) num, 'talks_fan' item FROM talks_fan
UNION
SELECT COUNT(*) num, 'talks_follow' item FROM talks_follow
UNION
SELECT COUNT(*) num, 'talks_comment' item FROM talks_comment
这将为您提供五行结果集,每个表一行。每行都是特定表中的计数。
如果你必须将它全部放入一行,你可以像这样做一个支点。
SELECT
SUM( CASE item WHEN 'talks' THEN num ELSE 0 END ) AS 'talks',
SUM( CASE item WHEN 'talks_fan' THEN num ELSE 0 END ) AS 'talks_fan',
SUM( CASE item WHEN 'talks_follow' THEN num ELSE 0 END ) AS 'talks_follow',
SUM( CASE item WHEN 'talks_comment' THEN num ELSE 0 END ) AS 'talks_comment'
FROM
( SELECT COUNT(*) num, 'talks' item FROM talks
UNION
SELECT COUNT(*) num, 'talks_fan' item FROM talks_fan
UNION
SELECT COUNT(*) num, 'talks_follow' item FROM talks_follow
UNION
SELECT COUNT(*) num, 'talks_comment' item FROM talks_comment
) counts
(这不会考虑您的WHERE g.privacy =
子句,因为我不明白。但您可以在{{1}中的四个查询中的一个查询中添加WHERE
子句处理它的项目。)
请注意,对于强制执行单个查询的四个单独的表,这确实是四个查询。
顺便说一下,当UNION
是表的主键时,COUNT(*)
和COUNT(id)
之间的值没有差异。 id
不计算COUNT(id)
为id
的行,但如果NULL
是主键,那么它是id
。但是NOT NULL
更快,所以请使用它。
编辑如果您需要每个不同谈话的粉丝数量,关注和评论行,请执行此操作。这与做联合和枢轴的想法相同,但有一个额外的参数。
COUNT(*)
在(太多年)这样做之后,我发现描述你需要的查询的最好方法是对自己说“我需要一个结果集,每个xxx有一行,yyy的列, zzz和qqq。“
答案 1 :(得分:0)
计数相同的原因是它在连接组合表之后计算行数。通过加入多个表,您将创建一个Cartesian product。
基本上,你不仅要计算每次谈话有多少评论,还要计算每次演讲有多少评论*粉丝。然后你会根据每个谈话有多少粉丝*评论来关注粉丝。因此,计数是相同的,而且它们都太高了。
这是一种更简单的方法来编写查询来统计每个不同的评论,关注者等只有一次:
SELECT t.*,
COUNT(DISTINCT b.comment_id) AS comments,
COUNT(DISTINCT bt.follow_id) AS followers
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
重新评论:我不会在同一个查询中获取所有关注者。你可以这样做:
SELECT t.*,
COUNT(DISTINCT b.comment_id) AS comments,
COUNT(DISTINCT bt.follow_id) AS followers,
GROUP_CONCAT(DISTINCT bt.follower_name) AS list_of_followers
FROM talks t
LEFT JOIN talks_follow bt ON bt.talk_id = t.talk_id
LEFT JOIN talks_comments b ON b.talk_id = t.talk_id
GROUP BY t.talk_id;
但是你得到的是一个单独的字符串,跟随者的名字用逗号分隔。现在你必须编写应用程序代码以在逗号上拆分字符串,你必须担心一些关注者名称是否实际上包含逗号,依此类推。
我会做第二个查询,为特定的谈话取得粉丝。无论如何,您可能只想为特定的谈话显示关注者。
SELECT follower_name
FROM talks_follow
WHERE talk_id = ?
答案 2 :(得分:0)
我可以说这是(至少)我今天改进的最酷的选择陈述之一。
SELECT STRAIGHT_JOIN
t.* ,
COUNT( DISTINCT b.comment_id ) AS comments,
COUNT( DISTINCT bt.follow_id ) AS followers,
COUNT( DISTINCT c.fan_id ) AS fans
FROM
(
SELECT * FROM talks
WHERE privacy = 'public'
ORDER BY created_date DESC
LIMIT 0, 30
) AS t
LEFT JOIN talks_follow bt ON (bt.talk_id = t.talk_id)
LEFT JOIN talks_comments b ON (b.talk_id = t.talk_id)
LEFT JOIN talks_fan c ON (c.talk_id = t.talk_id)
GROUP BY t.talk_id ;
但在我看来,你的问题存在于你的桌子上;获得有效查询的第一步是索引所需连接所涉及的每个字段。
我对你上面的表格进行了一些修改;您可以看到其代码here (已更新) 很有意思,不是吗?既然我们在这里,也请考虑你的ERR模型:
首先使用MySQL测试数据库进行尝试。希望它能解决您的性能问题。
(原谅我的英文,这是我的第二语言)