How do I JOIN an association table when the joined table can have one or more records?

Time: 2019-07-23 18:42:16

Tags: mysql sql mariadb

I have two tables:

posts (id, published_at)
posts_images (id, post_id, image_url(null or url string))

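Every post has at least one posts_images record and may have several.

My goal: a query that shows the percentage of posts with one or more images, broken down by week (7-day buckets).

Here is my query: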

SELECT  floor(datediff(p.created_at, curdate()) / 7) AS weeks_ago,
        date(min(p.created_at)) AS "Date Start",
        date(max(p.created_at)) AS "Date End",
        count(DISTINCT p.id) AS "Posts in Cohort",
        count(pc.image_url) / count(p.id) AS "Post w 1 or more Images Ratio"
FROM    posts p
        INNER JOIN posts_images pc
            ON p.id = pc.post_id
WHERE p.published_at IS NOT NULL
GROUP BY weeks_ago
ORDER BY weeks_ago DESC;

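The query runs and returns data, but because a post can have one or more posts_images rows, I'm not sure I'm doing the JOIN correctly. I'm worried that SQL picks only the first posts_images record instead of looking at all of them.

Am I doing this right?

2 Answers:

Answer 0: (score: 3)

I think you are better off with two levels of aggregation: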

SELECT  floor(datediff(p.created_at, curdate()) / 7) AS weeks_ago,
        date(min(p.created_at)) AS "Date Start",
        date(max(p.created_at)) AS "Date End",
        count(*) as  "Posts in Cohort",
        avg(has_image) as "Post w 1 or more Images Ratio"
FROM (SELECT p.id, p.created_at,
             ( MAX(pi.image_url) IS NOT NULL ) as has_image
      FROM posts p JOIN
           posts_images pi
           ON p.id = pi.post_id
      WHERE p.published_at IS NOT NULL
      GROUP BY p.id
     ) p
GROUP BY weeks_ago
ORDER BY weeks_ago DESC;
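
To see why the two levels matter, here is a minimal sketch that evaluates both ratio expressions on toy data. It makes several assumptions: Python's built-in sqlite3 module stands in for MySQL so the example is self-contained, the weekly bucketing is left out, the alias img replaces pi, and the three sample posts are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, published_at TEXT);
CREATE TABLE posts_images (id INTEGER PRIMARY KEY, post_id INTEGER, image_url TEXT);
INSERT INTO posts VALUES (1, '2019-07-01'), (2, '2019-07-02'), (3, '2019-07-03');
-- Invented data: post 1 has three images, post 2 has one,
-- post 3 has a single posts_images row whose image_url is NULL.
INSERT INTO posts_images (post_id, image_url) VALUES
  (1, 'a.png'), (1, 'b.png'), (1, 'c.png'), (2, 'd.png'), (3, NULL);
""")

# Single-level aggregation, as in the question: the join returns one row
# per posts_images record, so both counts are per image row, not per post.
joined_rows, per_row_ratio = conn.execute("""
SELECT COUNT(*), COUNT(pc.image_url) * 1.0 / COUNT(p.id)
FROM posts p
JOIN posts_images pc ON p.id = pc.post_id
WHERE p.published_at IS NOT NULL
""").fetchone()

# Two-level aggregation: collapse to one row per post first,
# then average the 0/1 has_image flag across posts.
post_count, per_post_ratio = conn.execute("""
SELECT COUNT(*), AVG(has_image)
FROM (SELECT p.id, MAX(img.image_url) IS NOT NULL AS has_image
      FROM posts p
      JOIN posts_images img ON p.id = img.post_id
      WHERE p.published_at IS NOT NULL
      GROUP BY p.id) t
""").fetchone()

print(joined_rows, per_row_ratio)    # 5 joined rows, ratio 4/5
print(post_count, per_post_ratio)    # 3 posts, ratio 2/3
```

The join does not pick the first posts_images record; it returns every matching row. That is why the single-level ratio is weighted toward posts with many images, while the two-level version counts each post once.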

Answer 1: (score: 1)

I would start by finding the posts that have multiple images:

SELECT post_id, COUNT(*) AS ct
    FROM posts_images
    GROUP BY post_id
    HAVING ct > 1

Then I would go to posts to find the weeks involved:

SELECT  floor(datediff(p.created_at, curdate()) / 7) AS weeks_ago,
        date(min(p.created_at)) AS "Date Start",
        date(max(p.created_at)) AS "Date End",
        count(*) AS "Posts in Cohort",
        ROUND(SUM(x.ct) / count(*), 3) AS "Post w 1 or more Images Ratio"
    FROM ( .. the query above .. ) AS x
    JOIN posts AS p  ON x.post_id = p.id
    GROUP BY weeks_ago
    ORDER BY weeks_ago DESC;

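The advantage over your approach is that the intermediate temporary table is smaller (one row per post rather than one row per image).

Potential issues:

  • Date Start and Date End may not span a full week, because they are the earliest and latest post dates that actually occur in each bucket. This can be solved by working backward from the FLOOR value to get the start/end of the "week".
  • Missing weeks. Filling them in would require another table containing every week, plus a messy LEFT JOIN.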
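
Both issues can also be handled in application code after the query returns. The sketch below is one possible approach, not the answerer's: it fills gaps in Python instead of using a weeks table with a LEFT JOIN. The function names bucket, week_bounds and fill_missing_weeks are hypothetical, and it assumes weeks_ago is computed as floor(datediff(created_at, curdate()) / 7), so past weeks have negative indexes.

```python
from datetime import date, timedelta
from typing import Dict, Tuple


def bucket(created: date, today: date) -> int:
    """Mirror floor(datediff(created, today) / 7); // floors toward -infinity."""
    return (created - today).days // 7


def week_bounds(weeks_ago: int, today: date) -> Tuple[date, date]:
    """Work backward from the bucket index to the first and last day it covers."""
    start = today + timedelta(days=7 * weeks_ago)
    return start, start + timedelta(days=6)


def fill_missing_weeks(counts: Dict[int, int]) -> Dict[int, int]:
    """Add a zero entry for every week index absent between the oldest and newest."""
    if not counts:
        return {}
    return {w: counts.get(w, 0) for w in range(min(counts), max(counts) + 1)}


today = date(2019, 7, 23)
print(bucket(date(2019, 7, 22), today))       # -1
print(week_bounds(-1, today))                 # 2019-07-16 through 2019-07-22
print(fill_missing_weeks({-4: 2, -1: 5}))     # {-4: 2, -3: 0, -2: 0, -1: 5}
```

With this indexing, bucket 0 starts today and runs six days forward, so in practice only the negative buckets contain published posts.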