SQL GROUP BY with multiple conditions

Time: 2014-02-12 21:34:26

Tags: mysql sql greatest-n-per-group

I am working with visitor log data and need to summarize it by IP address. The data looks like this:

id        | ip_address     | type     | message  | ...
----------+----------------+----------+----------------
1         | 1.2.3.4        | purchase | ...
2         | 1.2.3.4        | visit    | ...
3         | 3.3.3.3        | visit    | ...
4         | 3.3.3.3        | purchase | ...
5         | 4.4.4.4        | visit    | ...
6         | 4.4.4.4        | visit    | ...

For each IP address, one row should be chosen according to this ordering:

type="purchase" DESC, type="visit" DESC, id DESC

which yields:

chosenid  | ip_address     | type     | message  | ...
----------+----------------+----------+----------------
1         | 1.2.3.4        | purchase | ...
4         | 3.3.3.3        | purchase | ...
6         | 4.4.4.4        | visit    | ...

Is there an elegant way to get this data?


An ugly approach would be the following:

SET @row_num = 0;
CREATE TEMPORARY TABLE IF NOT EXISTS tt AS
SELECT *, @row_num := @row_num + 1 AS row_index
FROM log
ORDER BY type = 'purchase' DESC, type = 'visit' DESC, id DESC;

Then get the minimum row_index and its id for each ip_address (https://stackoverflow.com/questions/121387/fetch-the-row-which-has-the-max-value-for-a-column).

Then join those ids back to the original table.
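The remaining two steps of that approach could be sketched roughly as follows. This assumes the temporary table tt from above already exists and that MySQL assigned row_index in the sorted order, which is not guaranteed for user variables. The helper table best and its column best_index are hypothetical names; a second temporary table is used because MySQL does not allow a temporary table to be referenced twice in one query.

```sql
-- Step 2: keep the smallest row_index (highest priority) per IP address.
-- "best" and "best_index" are hypothetical names used for illustration.
CREATE TEMPORARY TABLE IF NOT EXISTS best AS
SELECT ip_address, MIN(row_index) AS best_index
FROM tt
GROUP BY ip_address;

-- Step 3: map each winning row_index back to its id, then join the ids
-- back to the original table.
SELECT l.*
FROM log l
JOIN tt     ON tt.id = l.id
JOIN best b ON b.ip_address = tt.ip_address
           AND b.best_index = tt.row_index
ORDER BY l.id;
```

Because tt already contains every column of log, the final join to log is only needed if the result must come from the original table.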

3 answers:

Answer 0: (score: 1)

I think this should be what you need:

SELECT yourtable.*
FROM
  yourtable INNER JOIN (
    SELECT   ip_address,
             MAX(CASE WHEN type='purchase' THEN id END) max_purchase,
             MAX(CASE WHEN type='visit' THEN id END) max_visit
    FROM     yourtable
    GROUP BY ip_address) m
  ON yourtable.id = COALESCE(max_purchase, max_visit)

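See the fiddle here.

My subquery returns the maximum purchase id (or NULL if there is no purchase) and the maximum visit id for each IP address. I then join the table using COALESCE: if max_purchase is not NULL, the join is on max_purchase; otherwise it is on max_visit.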
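To see what the subquery hands to the join, it can be run on its own against the six sample rows from the question. The sketch below assumes a scratch table named visitor_log_demo (a hypothetical name) holding only the three relevant columns; the chosen_id column is added purely for illustration.

```sql
-- Hypothetical demo table with the six sample rows from the question.
CREATE TABLE visitor_log_demo (
  id         INT PRIMARY KEY,
  ip_address VARCHAR(45) NOT NULL,
  type       VARCHAR(20) NOT NULL
);

INSERT INTO visitor_log_demo (id, ip_address, type) VALUES
  (1, '1.2.3.4', 'purchase'),
  (2, '1.2.3.4', 'visit'),
  (3, '3.3.3.3', 'visit'),
  (4, '3.3.3.3', 'purchase'),
  (5, '4.4.4.4', 'visit'),
  (6, '4.4.4.4', 'visit');

-- CASE without ELSE yields NULL for non-matching rows, and MAX ignores NULLs,
-- so each column holds the newest id of that type (or NULL if there is none).
SELECT ip_address,
       MAX(CASE WHEN type = 'purchase' THEN id END) AS max_purchase,
       MAX(CASE WHEN type = 'visit'    THEN id END) AS max_visit,
       COALESCE(MAX(CASE WHEN type = 'purchase' THEN id END),
                MAX(CASE WHEN type = 'visit'    THEN id END)) AS chosen_id
FROM visitor_log_demo
GROUP BY ip_address
ORDER BY ip_address;
-- ip_address | max_purchase | max_visit | chosen_id
-- 1.2.3.4    | 1            | 2         | 1
-- 3.3.3.3    | 4            | 3         | 4
-- 4.4.4.4    | NULL         | 6         | 6
```

The chosen_id values match the chosenid column of the expected result in the question.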

Answer 1: (score: 0)

You can use Bill Karwin's approach here:

SELECT t1.*
FROM (SELECT *, CASE WHEN type = 'purchase' THEN 1 ELSE 0 END is_purchase FROM myTable) t1
LEFT JOIN (SELECT *, CASE WHEN type = 'purchase' THEN 1 ELSE 0 END is_purchase FROM myTable) t2
  ON t1.ip_address = t2.ip_address
  AND (t2.is_purchase > t1.is_purchase
     OR (t2.is_purchase = t1.is_purchase AND t2.id > t1.id))
WHERE t2.id IS NULL 

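SQL Fiddle here.

Answer 2: (score: 0)

The following query uses a correlated subquery to get the most recent id for each IP address according to your rules: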

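-- yourtable is a placeholder for the actual table name
select t.ip_address,
       (select t2.id
        from yourtable t2
        where t2.ip_address = t.ip_address
        -- purchases sort first, then the highest id within each type
        order by t2.type = 'purchase' desc, t2.id desc
        limit 1
       ) as mostrecent
from (select distinct ip_address
      from yourtable
     ) t;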

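The idea is to sort the data with purchases first (ids descending), then visits, and pick the first row in the list. If you have a table of IP addresses, you don't need the distinct subquery; just use that table.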
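As a small illustration of that ordering, the sketch below sorts the two sample rows for 1.2.3.4 using an inline derived table, so no real table is needed. It relies on MySQL evaluating a comparison such as type = 'purchase' to 1 or 0; the alias s and the column is_purchase are names chosen here for the example.

```sql
-- The two sample rows for 1.2.3.4, built inline.
SELECT s.id, s.type, (s.type = 'purchase') AS is_purchase
FROM (
  SELECT 1 AS id, 'purchase' AS type
  UNION ALL
  SELECT 2, 'visit'
) s
-- Rows where the comparison is 1 (purchases) come first, then higher ids.
ORDER BY s.type = 'purchase' DESC, s.id DESC;
-- id | type     | is_purchase
-- 1  | purchase | 1
-- 2  | visit    | 0
```

The purchase row sorts first even though the visit row has the higher id, so adding limit 1 would keep id 1.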

To get the final result, we can join to this or use in or exists. This version uses a join:

-- yourtable is a placeholder for the actual table name
select t.*
from yourtable t join
     (select ips.ip_address,
             (select t2.id
              from yourtable t2
              where t2.ip_address = ips.ip_address
              order by t2.type = 'purchase' desc, t2.id desc
              limit 1
             ) as mostrecent
      from (select distinct ip_address
            from yourtable
           ) ips
     ) ids
     on t.id = ids.mostrecent;

This query is most efficient if the table has an index on (ip_address, type, id).
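Such an index could be created along these lines. The index name idx_ip_type_id is hypothetical, and yourtable stands in for the real table name.

```sql
-- Hypothetical index name; yourtable is a placeholder for the real table.
-- ip_address comes first so the correlated subquery can look up one
-- address's rows through the index; type and id are included so the
-- subquery can potentially be answered from the index alone.
CREATE INDEX idx_ip_type_id ON yourtable (ip_address, type, id);
```

Whether the optimizer actually uses the index depends on the data and the MySQL version, so it is worth checking the plan with EXPLAIN.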