I'm using a MySQL database and have a basic blog-type system about computer applications. The tables look like this:
POSTS
post_id
post_created
post_type -- could be article, review, feature, whatever
post_status -- 'a' approved or 'd' for draft
APPS
app_id
app_name
app_platform -- Windows, linux, unix, etc..
APP_TO_POST -- links my posts to its relevant application
atp_id
atp_app_id
atp_post_id
I'm using the following basic query to pull all posts for the application named 'Photoshop', where the post type is 'Article' and the post status is 'a' (approved):
SELECT apps.app_name, apps.app_platform, posts.post_created, posts.post_id
FROM apps
JOIN app_to_post ON app_to_post.atp_app_id = apps.app_id
JOIN posts ON app_to_post.atp_post_id = posts.post_id
WHERE apps.app_name = 'Photoshop'
AND
posts.post_type = 'Article'
AND
posts.post_status = 'a'
This gives me the expected results:
app_name app_platform post_created post_id
Photoshop Windows Oct. 20th, 2009 1
Photoshop Windows Dec. 1, 2009 3
Photoshop Macintosh Nov. 10th, 2009 2
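Could someone help me change the query so that it pulls only the latest article for each application platform? For example, I'd like my results to look like this: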
app_name app_platform post_created post_id
Photoshop Windows Dec. 1, 2009 3
Photoshop Macintosh Nov. 10th, 2009 2
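and omit one of the 'Photoshop Windows' articles, because it is not the latest one.
If I simply add MAX(post_created) and GROUP BY app_platform, my results are not always grouped correctly. From what I understand, I need some kind of inner join against a subquery?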
Answer 0: (score: 4)
Since you have quite a few JOINs, I suggest creating a VIEW first:
CREATE VIEW articles AS
SELECT a.app_name, a.app_platform, p.post_created, p.post_id
FROM apps a
JOIN app_to_post ap ON ap.atp_app_id = a.app_id
JOIN posts p ON ap.atp_post_id = p.post_id
WHERE p.post_type = 'Article' AND p.post_status = 'a';
Then you can use a NULL-self-join, which keeps only the rows for which no later article exists on the same platform:
SELECT a1.app_name, a1.app_platform, a1.post_created, a1.post_id
FROM articles a1
LEFT JOIN articles a2 ON
a2.app_platform = a1.app_platform AND a2.post_created > a1.post_created
WHERE a2.post_id IS NULL;
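One caveat worth sketching: the join above compares only post_created, so two articles on the same platform with identical timestamps would both survive. The variant below is a minimal sketch, assuming post_id is unique and can serve as a tie-breaker. The extra app_name condition is also an assumption, added in case the articles view holds more than one application.

```sql
-- Sketch only: relies on the `articles` view created above.
-- Assumption: post_id is unique, so it can break ties on post_created.
SELECT a1.app_name, a1.app_platform, a1.post_created, a1.post_id
FROM articles a1
LEFT JOIN articles a2
       ON a2.app_name = a1.app_name          -- assumption: compare within one application
      AND a2.app_platform = a1.app_platform
      AND (a2.post_created > a1.post_created
           OR (a2.post_created = a1.post_created
               AND a2.post_id > a1.post_id)) -- tie-breaker on post_id
WHERE a2.post_id IS NULL;                    -- no "later" row was found
```

With the test data in this answer, no two articles share a timestamp, so this should return the same two rows as the simpler query.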
Test case:
CREATE TABLE posts (
post_id int,
post_created datetime,
post_type varchar(30),
post_status char(1)
);
CREATE TABLE apps (
app_id int,
app_name varchar(40),
app_platform varchar(40)
);
CREATE TABLE app_to_post (
atp_id int,
atp_app_id int,
atp_post_id int
);
INSERT INTO posts VALUES (1, '2010-10-06 05:00:00', 'Article', 'a');
INSERT INTO posts VALUES (2, '2010-10-06 06:00:00', 'Article', 'a');
INSERT INTO posts VALUES (3, '2010-10-06 07:00:00', 'Article', 'a');
INSERT INTO posts VALUES (4, '2010-10-06 08:00:00', 'Article', 'a');
INSERT INTO posts VALUES (5, '2010-10-06 09:00:00', 'Article', 'a');
INSERT INTO apps VALUES (1, 'Photoshop', 'Windows');
INSERT INTO apps VALUES (2, 'Photoshop', 'Macintosh');
INSERT INTO app_to_post VALUES (1, 1, 1);
INSERT INTO app_to_post VALUES (1, 1, 2);
INSERT INTO app_to_post VALUES (1, 2, 3);
INSERT INTO app_to_post VALUES (1, 2, 4);
INSERT INTO app_to_post VALUES (1, 1, 5);
Result:
+-----------+--------------+---------------------+---------+
| app_name | app_platform | post_created | post_id |
+-----------+--------------+---------------------+---------+
| Photoshop | Macintosh | 2010-10-06 08:00:00 | 4 |
| Photoshop | Windows | 2010-10-06 09:00:00 | 5 |
+-----------+--------------+---------------------+---------+
2 rows in set (0.00 sec)
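As a side note, in general you do not need a surrogate key for a junction table. You can instead set up a composite primary key (ideally with foreign keys to the referenced tables):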
CREATE TABLE app_to_post (
atp_app_id int,
atp_post_id int,
PRIMARY KEY (atp_app_id, atp_post_id),
FOREIGN KEY (atp_app_id) REFERENCES apps (app_id),
FOREIGN KEY (atp_post_id) REFERENCES posts (post_id)
) ENGINE=INNODB;
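Answer 1: (score: 3)
Let's first consider how to get from your query result to the result you want, that is, how to keep only the rows holding the maximum value:
Your result (let's call it table T):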
app_name app_platform post_created post_id
Photoshop Windows Oct. 20th, 2009 1
Photoshop Windows Dec. 1, 2009 3
Photoshop Macintosh Nov. 10th, 2009 2
The result you want:
app_name app_platform post_created post_id
Photoshop Windows Dec. 1, 2009 3
Photoshop Macintosh Nov. 10th, 2009 2
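To get this result, you should:
1. find the latest post_created for each app_platform in T;
2. join that result back to T to retrieve the full rows that carry those values.
The query looks like this: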
SELECT
  t1.app_name, t1.app_platform, t1.post_created, t1.post_id
FROM
  (SELECT app_platform, MAX(post_created) AS MaxPostCreated
   FROM T
   GROUP BY app_platform) AS t2
JOIN T AS t1
  ON t1.app_platform = t2.app_platform
 AND t1.post_created = t2.MaxPostCreated;
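In this query, the subquery performs the first step and the join performs the second.
Combined with your part of the query, the final result looks like this (with a view):
CREATE VIEW T AS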
SELECT a.app_name, a.app_platform, p.post_created, p.post_id
FROM apps a
JOIN app_to_post ap ON ap.atp_app_id = a.app_id
JOIN posts p ON ap.atp_post_id = p.post_id
WHERE p.post_type = 'Article' AND p.post_status = 'a';
SELECT
  t1.app_name, t1.app_platform, t1.post_created, t1.post_id
FROM
  (SELECT app_platform, MAX(post_created) AS MaxPostCreated
   FROM T
   GROUP BY app_platform) AS t2
JOIN T AS t1
  ON t1.app_platform = t2.app_platform
 AND t1.post_created = t2.MaxPostCreated;
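For comparison, the same two steps can be collapsed into one pass with a window function. This is a sketch rather than part of the answer above: it assumes MySQL 8.0 or later (where ROW_NUMBER() is available) and the view T from this answer. The names src, ranked, and rn are illustrative, and post_id is assumed unique so it can break ties on post_created.

```sql
-- Sketch only: requires MySQL 8.0+ and the view T defined in this answer.
SELECT app_name, app_platform, post_created, post_id
FROM (
    SELECT src.*,
           -- Number the rows within each platform, newest first;
           -- post_id (assumed unique) breaks ties on post_created.
           ROW_NUMBER() OVER (PARTITION BY app_platform
                              ORDER BY post_created DESC, post_id DESC) AS rn
    FROM T AS src
) AS ranked
WHERE rn = 1;  -- keep only the newest row per platform
```

Unlike the MAX-and-join form, this returns exactly one row per platform even when two articles share the latest timestamp.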
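By the way, our team is developing a tool that tries to help users write queries automatically: the user gives the tool input-output examples, and the tool generates the query. (The first part of the query above was actually generated by the tool! The link to our prototype is https://github.com/Mestway/Scythe)
Hope this helps. :)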
Answer 2: (score: 0)
You're on the right track.
Try adding
group by app_name,app_platform
having post_created=max(post_created)
Or, if your post_id values are sequential, so that a higher value always reflects a later post, use this clause instead: having post_id=max(post_id)
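A caution on the HAVING approach: with the ONLY_FULL_GROUP_BY SQL mode enabled (the default since MySQL 5.7.5), a HAVING condition that refers to a non-aggregated column outside the GROUP BY list is rejected, and with the mode disabled the server is free to take that column's value from any row in the group. The following is a minimal sketch of the same intent written as a correlated subquery. It assumes each (app_name, app_platform) pair is a single row in apps, so that app_id identifies the group; the aliases ap2 and p2 are illustrative.

```sql
-- Sketch only: keeps a post when its post_created equals the latest
-- approved-article timestamp for the same apps row.
SELECT a.app_name, a.app_platform, p.post_created, p.post_id
FROM apps a
JOIN app_to_post ap ON ap.atp_app_id = a.app_id
JOIN posts p ON ap.atp_post_id = p.post_id
WHERE a.app_name = 'Photoshop'
  AND p.post_type = 'Article'
  AND p.post_status = 'a'
  AND p.post_created = (
        SELECT MAX(p2.post_created)
        FROM app_to_post ap2
        JOIN posts p2 ON ap2.atp_post_id = p2.post_id
        WHERE ap2.atp_app_id = a.app_id   -- correlate on the outer apps row
          AND p2.post_type = 'Article'
          AND p2.post_status = 'a'
  );
```

If two articles for the same apps row share the latest timestamp, both are returned.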