MySQL: Greatest n per group with joins and conditions

Time: 2018-09-02 19:00:30

Tags: mysql sql greatest-n-per-group

Table structure

I have tables similar to the following:

venues

This table describes a list of businesses.

id    name
50    Nando's
60    KFC

rewards

This table describes a number of rewards, the venue each reward belongs to, and the number of points required to redeem it.

id    venue_id    name        points
1     50          5% off      10
2     50          10% off     20
3     50          11% off     30
4     50          15% off     40
5     50          20% off     50
6     50          30% off     50
7     60          30% off     70
8     60          60% off     100
9     60          65% off     120
10    60          70% off     130
11    60          80% off     140

points_data

This table describes the number of points the user has left at each venue.

venue_id    points_remaining
50          30
60          90

Note that this table is actually computed in SQL, as follows:

select * from (
  select venue_id, (total_points - points_redeemed) as points_remaining
  from (
         select venue_id, sum(total_points) as total_points, sum(points_redeemed) as points_redeemed
         from (
                (
                  select venue_id, sum(points) as total_points, 0 as points_redeemed
                  from check_ins
                  group by venue_id
                )
                UNION
                (
                  select venue_id, 0 as total_points, sum(points) as points_redeemed
                  from reward_redemptions rr
                    join rewards r on rr.reward_id = r.id
                  group by venue_id
                )
              ) a
         group by venue_id
       ) b
  GROUP BY venue_id
) points_data

For the purposes of this question, though, you can probably ignore that large query and assume the table is simply called points_data.

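Desired output

I want a single query that returns:

  • The top 2 rewards the user is eligible for at each venue
  • The lowest 2 rewards the user is not yet eligible for at each venue

So for the data above, the output would be: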

id    venue_id    name        points
2     50          10% off     20
3     50          11% off     30
4     50          15% off     40
5     50          20% off     50
7     60          30% off     70
8     60          60% off     100
9     60          65% off     120
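
As a cross-check of the two rules above, here is a minimal sketch in plain Python that applies them to the sample data. The function name `pick_rewards`, the tuple layout, and the tie-breaking by id are assumptions made for illustration; they are not part of the original question.

```python
from collections import defaultdict

# Sample rows copied from the rewards table: (id, venue_id, name, points).
REWARDS = [
    (1, 50, "5% off", 10), (2, 50, "10% off", 20), (3, 50, "11% off", 30),
    (4, 50, "15% off", 40), (5, 50, "20% off", 50), (6, 50, "30% off", 50),
    (7, 60, "30% off", 70), (8, 60, "60% off", 100), (9, 60, "65% off", 120),
    (10, 60, "70% off", 130), (11, 60, "80% off", 140),
]
# Sample rows copied from points_data: venue_id -> points_remaining.
POINTS_REMAINING = {50: 30, 60: 90}


def pick_rewards(rewards, points_remaining, n=2):
    """Return the n best affordable and n cheapest unaffordable rewards per venue."""
    by_venue = defaultdict(list)
    for row in rewards:
        by_venue[row[1]].append(row)

    picked = []
    for venue_id, rows in by_venue.items():
        # Venues without a points_data entry are skipped, like an inner join.
        if venue_id not in points_remaining:
            continue
        remaining = points_remaining[venue_id]
        affordable = [r for r in rows if r[3] <= remaining]
        unaffordable = [r for r in rows if r[3] > remaining]
        # Highest points first; ties go to the higher id (an assumption).
        affordable.sort(key=lambda r: (r[3], r[0]), reverse=True)
        # Lowest points first; ties go to the lower id (an assumption).
        unaffordable.sort(key=lambda r: (r[3], r[0]))
        picked.extend(affordable[:n] + unaffordable[:n])
    # Same ordering as the desired output: by venue, then by points.
    return sorted(picked, key=lambda r: (r[1], r[3], r[0]))


selected_ids = [r[0] for r in pick_rewards(REWARDS, POINTS_REMAINING)]
print(selected_ids)  # [2, 3, 4, 5, 7, 8, 9]
```

Rewards 5 and 6 both cost 50 points at venue 50, so which of them appears depends on the tie-break; the desired output lists id 5, which matches breaking ties toward the lower id.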

What I have so far

The best solution I have found so far is to fetch points_data first and then use code (i.e. PHP) to dynamically write the following:

(
  select * from rewards
  where venue_id = 50
  and points > 30
  ORDER BY points asc
  LIMIT 2
)
union all
(
  select * from rewards
  where venue_id = 50
        and points <= 30
  ORDER BY points desc
  LIMIT 2
)
UNION ALL
(
  select * from rewards
  where venue_id = 60
        and points <= 90
  ORDER BY points desc
  LIMIT 2
)
UNION ALL
(
  select * from rewards
  where venue_id = 60
        and points > 90
  ORDER BY points asc
  LIMIT 2
)
ORDER BY venue_id, points asc;

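However, I feel the query could become too long and inefficient. For example, if the user has points at 400 venues, that is 800 subqueries.

I also tried a join like this, but it does not really get me any closer: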

select * from points_data
INNER JOIN rewards on rewards.venue_id = points_data.venue_id
where points > points_remaining;

This is far from what I want.

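1 Answer:

Answer 0: (score: 2)

One approach is to use correlated subqueries that count the number of higher or lower rewards to determine the top or bottom entries.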

SELECT r1.*
       FROM rewards r1
            INNER JOIN points_data pd1
                       ON pd1.venue_id = r1.venue_id
       WHERE r1.points <= pd1.points_remaining
             AND (SELECT count(*)
                         FROM rewards r2
                         WHERE r2.venue_id = r1.venue_id
                               AND r2.points <= pd1.points_remaining
                               AND (r2.points > r1.points
                                     OR r2.points = r1.points
                                        AND r2.id > r1.id)) < 2
              OR r1.points > pd1.points_remaining
                 AND (SELECT count(*)
                             FROM rewards r2
                             WHERE r2.venue_id = r1.venue_id
                                   AND r2.points > pd1.points_remaining
                                   AND (r2.points < r1.points
                                         OR r2.points = r1.points
                                            AND r2.id < r1.id)) < 2
       ORDER BY r1.venue_id,
                r1.points;
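
To try the counting approach without a MySQL server, the sketch below loads the sample data into an in-memory SQLite database through Python's built-in sqlite3 module. SQLite is only a stand-in here (an assumption, since the question targets MySQL); the WHERE clause is the one from the query above, and only the id column is selected to keep the result short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data from the question; SQLite stands in for MySQL (an assumption).
conn.executescript("""
    CREATE TABLE rewards (id INTEGER PRIMARY KEY, venue_id INTEGER,
                          name TEXT, points INTEGER);
    CREATE TABLE points_data (venue_id INTEGER PRIMARY KEY,
                              points_remaining INTEGER);
    INSERT INTO rewards VALUES
        (1, 50, '5% off', 10), (2, 50, '10% off', 20), (3, 50, '11% off', 30),
        (4, 50, '15% off', 40), (5, 50, '20% off', 50), (6, 50, '30% off', 50),
        (7, 60, '30% off', 70), (8, 60, '60% off', 100), (9, 60, '65% off', 120),
        (10, 60, '70% off', 130), (11, 60, '80% off', 140);
    INSERT INTO points_data VALUES (50, 30), (60, 90);
""")

QUERY = """
SELECT r1.id
FROM rewards r1
INNER JOIN points_data pd1 ON pd1.venue_id = r1.venue_id
WHERE r1.points <= pd1.points_remaining
      -- fewer than 2 affordable rewards rank above r1
      AND (SELECT count(*)
           FROM rewards r2
           WHERE r2.venue_id = r1.venue_id
                 AND r2.points <= pd1.points_remaining
                 AND (r2.points > r1.points
                      OR r2.points = r1.points AND r2.id > r1.id)) < 2
   OR r1.points > pd1.points_remaining
      -- fewer than 2 unaffordable rewards rank below r1
      AND (SELECT count(*)
           FROM rewards r2
           WHERE r2.venue_id = r1.venue_id
                 AND r2.points > pd1.points_remaining
                 AND (r2.points < r1.points
                      OR r2.points = r1.points AND r2.id < r1.id)) < 2
ORDER BY r1.venue_id, r1.points
"""

ids = [row[0] for row in conn.execute(QUERY)]
print(ids)  # [2, 3, 4, 5, 7, 8, 9]
```

The id comparison inside each subquery is what makes the cut deterministic when two rewards cost the same: of ids 5 and 6 (both 50 points), only id 5 has fewer than two cheaper-or-earlier unaffordable rewards.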


Since MySQL 8.0, a solution using the row_number() window function is an alternative. But I guess you are on a lower version.

SELECT x.id,
       x.venue_id,
       x.name,
       x.points
       FROM (SELECT r.id,
                    r.venue_id,
                    r.name,
                    r.points,
                    pd.points_remaining,
                    row_number() OVER (PARTITION BY r.venue_id,
                                                    r.points <= pd.points_remaining
                                       ORDER BY r.points DESC,
                                                r.id DESC) rntop,
                    row_number() OVER (PARTITION BY r.venue_id,
                                                    r.points > pd.points_remaining
                                       ORDER BY r.points ASC,
                                                r.id ASC) rnbottom
                    FROM rewards r
                         INNER JOIN points_data pd
                                    ON pd.venue_id = r.venue_id) x
       WHERE x.points <= x.points_remaining
             AND x.rntop <= 2
              OR x.points > x.points_remaining
                 AND x.rnbottom <= 2
       ORDER BY x.venue_id,
                x.points;


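The trickiest part here is to additionally partition each venue's set into two subsets: one where the user's points are sufficient to redeem the reward, and one where they are not. But since logical expressions in MySQL evaluate to 0 or 1 (in a non-Boolean context), the respective comparison expressions can be used as partition keys for that.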
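
The 0/1 behaviour is easy to see in isolation. The following is a small sketch using Python's built-in sqlite3 module, where SQLite again stands in for MySQL (an assumption; both evaluate a comparison to 0 or 1 here) and only the venue 50 rows are loaded. It uses GROUP BY rather than a window PARTITION BY so that it also runs on SQLite builds without window functions; the key expression is the same one used in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Venue 50 rows only; SQLite stands in for MySQL (an assumption).
conn.executescript("""
    CREATE TABLE rewards (id INTEGER PRIMARY KEY, venue_id INTEGER,
                          points INTEGER);
    CREATE TABLE points_data (venue_id INTEGER PRIMARY KEY,
                              points_remaining INTEGER);
    INSERT INTO rewards VALUES (1, 50, 10), (2, 50, 20), (3, 50, 30),
                               (4, 50, 40), (5, 50, 50), (6, 50, 50);
    INSERT INTO points_data VALUES (50, 30);
""")

# Selected as an ordinary column, the comparison yields 1 (affordable) or 0.
flags = conn.execute("""
    SELECT r.id, r.points <= pd.points_remaining AS affordable
    FROM rewards r
    INNER JOIN points_data pd ON pd.venue_id = r.venue_id
    ORDER BY r.id
""").fetchall()

# Used as a grouping key, the same expression splits the venue into two subsets.
subsets = conn.execute("""
    SELECT r.venue_id, r.points <= pd.points_remaining AS affordable, count(*) AS n
    FROM rewards r
    INNER JOIN points_data pd ON pd.venue_id = r.venue_id
    GROUP BY r.venue_id, r.points <= pd.points_remaining
    ORDER BY affordable
""").fetchall()

print(flags)    # [(1, 1), (2, 1), (3, 1), (4, 0), (5, 0), (6, 0)]
print(subsets)  # [(50, 0, 3), (50, 1, 3)]
```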