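I have just finished my latest task: building an RSS feed in PHP that pulls its data from a database.
I then noticed that many (if not all) of the items are duplicated, and I am trying to work out how to fetch only one of each.
One idea was to print only every second row in my PHP loop, which would work if each item had exactly one duplicate. Some articles appear 3 or 4 times, though, so this has to be solved in the query somehow.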
Query:
SELECT *
FROM uk_newsreach_article t1
INNER JOIN uk_newsreach_article_photo t2
ON t1.id = t2.newsArticleID
INNER JOIN uk_newsreach_photo t3
ON t2.newsPhotoID = t3.id
ORDER BY t1.publishDate DESC;
Table structure:
uk_newsreach_article
--------------------
id | headline | extract | text | publishDate | ...
uk_newsreach_article_photo
--------------------------
id | newsArticleID | newsPhotoID
uk_newsreach_photo
------------------
id | htmlAlt | URL | height | width | ...
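For some reason there are a lot of duplicates. The only column that is truly unique within each set of rows is uk_newsreach_article_photo.id; uk_newsreach_article_photo.newsArticleID and uk_newsreach_article_photo.newsPhotoID are identical across the duplicates in a set. What I need is one row from each set, for example:
Sample data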
id | newsArticleID | newsPhotoID
--------------------------------
2 | 800482746 | 7044521
10 | 800482746 | 7044521
19 | 800482746 | 7044521
29 | 800482746 | 7044521
39 | 800482746 | 7044521
53 | 800482746 | 7044521
67 | 800482746 | 7044521
I tried adding DISTINCT to the query and listing the actual columns I want, but that did not work.
Answer 0 (score: 1)
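Grouping by every selected column removes the duplicates. Leave uk_newsreach_article_photo.id out of both the select list and the GROUP BY, because it is unique per row and would put every row in its own group. HAVING COUNT(*) > 1 is not needed either, since it would keep only the groups that contain duplicates. Using the columns shown in the question (photoID is just an alias to tell the two id columns apart):
SELECT t1.id, t1.headline, t1.extract, t1.text, t1.publishDate,
t3.id AS photoID, t3.htmlAlt, t3.URL, t3.height, t3.width
FROM uk_newsreach_article t1
INNER JOIN uk_newsreach_article_photo t2
ON t1.id = t2.newsArticleID
INNER JOIN uk_newsreach_photo t3
ON t2.newsPhotoID = t3.id
GROUP BY t1.id, t1.headline, t1.extract, t1.text, t1.publishDate,
t3.id, t3.htmlAlt, t3.URL, t3.height, t3.width
ORDER BY t1.publishDate DESC;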
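To see the effect of grouping on a small dataset, here is a minimal sketch. It is not the original setup: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, trims the tables to a few columns, and the headline, date and URL values are made up for illustration. The grouped query leaves the link table's id out, since that column is unique per row.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE uk_newsreach_article (
        id INTEGER PRIMARY KEY, headline TEXT, publishDate TEXT);
    CREATE TABLE uk_newsreach_photo (
        id INTEGER PRIMARY KEY, URL TEXT);
    CREATE TABLE uk_newsreach_article_photo (
        id INTEGER PRIMARY KEY, newsArticleID INTEGER, newsPhotoID INTEGER);

    -- Hypothetical article and photo values; the link rows mirror the
    -- duplicated article/photo pair from the question.
    INSERT INTO uk_newsreach_article
        VALUES (800482746, 'Example headline', '2013-01-01');
    INSERT INTO uk_newsreach_photo
        VALUES (7044521, 'http://example.com/photo.jpg');
    INSERT INTO uk_newsreach_article_photo VALUES
        (2, 800482746, 7044521),
        (10, 800482746, 7044521),
        (19, 800482746, 7044521);
""")

# The same three-table join as in the question.
join = """
    FROM uk_newsreach_article t1
    INNER JOIN uk_newsreach_article_photo t2 ON t1.id = t2.newsArticleID
    INNER JOIN uk_newsreach_photo t3 ON t2.newsPhotoID = t3.id
"""

# Without grouping, every duplicated link row produces one result row.
plain_rows = conn.execute("SELECT t1.id, t3.id " + join).fetchall()

# Grouping by the article and photo columns (and not by t2.id)
# collapses the duplicates into a single row per article/photo pair.
grouped_rows = conn.execute(
    "SELECT t1.id, t1.headline, t1.publishDate, t3.id, t3.URL " + join +
    "GROUP BY t1.id, t1.headline, t1.publishDate, t3.id, t3.URL "
    "ORDER BY t1.publishDate DESC"
).fetchall()

print(len(plain_rows))    # 3
print(len(grouped_rows))  # 1
```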
Answer 1 (score: 1)
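As you noticed, DISTINCT still returns every row, because the id is different on each one. You can use GROUP BY instead.
You have to decide which id to keep. In the example I used MIN, but any aggregate function would do.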
SELECT MIN(t2.id) AS id, t2.newsArticleID, t2.newsPhotoID
FROM uk_newsreach_article t1
INNER JOIN uk_newsreach_article_photo t2
ON t1.id = t2.newsArticleID
INNER JOIN uk_newsreach_photo t3
ON t2.newsPhotoID = t3.id
GROUP BY t2.newsArticleID, t2.newsPhotoID
ORDER BY t1.publishDate DESC;
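The difference between DISTINCT and GROUP BY with an aggregate can be checked against the sample rows from the question. This is only a sketch: it runs on an in-memory SQLite database through Python's sqlite3 module rather than on MySQL, and it queries the link table alone to keep the example short.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL link table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE uk_newsreach_article_photo ("
    "id INTEGER PRIMARY KEY, newsArticleID INTEGER, newsPhotoID INTEGER)"
)

# The seven sample rows from the question: same article/photo pair,
# different id on every row.
sample_ids = [2, 10, 19, 29, 39, 53, 67]
conn.executemany(
    "INSERT INTO uk_newsreach_article_photo VALUES (?, 800482746, 7044521)",
    [(i,) for i in sample_ids],
)

# DISTINCT compares whole rows, so the unique id keeps every row distinct.
distinct_with_id = conn.execute(
    "SELECT DISTINCT id, newsArticleID, newsPhotoID "
    "FROM uk_newsreach_article_photo"
).fetchall()

# GROUP BY on the repeated pair plus MIN(id) keeps one row per pair.
kept = conn.execute(
    "SELECT MIN(id), newsArticleID, newsPhotoID "
    "FROM uk_newsreach_article_photo "
    "GROUP BY newsArticleID, newsPhotoID"
).fetchall()

print(len(distinct_with_id))  # 7
print(kept)                   # [(2, 800482746, 7044521)]
```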
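While this is a simple solution to your immediate problem, if you believe the duplicates should not occur in the first place, you should really consider redesigning the table so that they cannot get in.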
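One possible way to keep duplicates out is a unique constraint on the (newsArticleID, newsPhotoID) pair. The sketch below is an assumption about how that could look, not the original schema, and it again uses an in-memory SQLite database through Python's sqlite3 module instead of MySQL.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL link table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE uk_newsreach_article_photo (
        id INTEGER PRIMARY KEY,
        newsArticleID INTEGER NOT NULL,
        newsPhotoID INTEGER NOT NULL,
        -- Each article/photo pair may be stored only once.
        UNIQUE (newsArticleID, newsPhotoID)
    )
""")

insert_sql = (
    "INSERT INTO uk_newsreach_article_photo (newsArticleID, newsPhotoID) "
    "VALUES (800482746, 7044521)"
)
conn.execute(insert_sql)

# A second insert of the same pair violates the constraint.
try:
    conn.execute(insert_sql)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute(
    "SELECT COUNT(*) FROM uk_newsreach_article_photo"
).fetchone()[0]

print(duplicate_rejected)  # True
print(row_count)           # 1
```

In MySQL the comparable step on an existing table would be ALTER TABLE uk_newsreach_article_photo ADD UNIQUE (newsArticleID, newsPhotoID); the duplicate rows already in the table would have to be removed first, or that statement fails.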