Comparing 2 rows in a table

Time: 2017-04-28 07:23:46

Tags: mysql sql hive

I have a MySQL table like the one below:

accountNum  date  status  action  qty  time
----------  ----  ------  ------  ---  -----
1234        2017  filled  B        10  11:20
1234        2017  filled  S        10  11:20
2345        2017  filled  B        20  12:00
2345        2017  filled  B        10  12:00
4444        2017  filled  B         5  01:00
4444        2017  filled  S         5  02:00

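I want to compare pairs of rows: a row with action "B" followed by a row with action "S". When such a pair is found, I have to check whether accountNum, date, time and status are the same in both rows.

So, based on the test data above, I should get only the first two rows: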

accountNum  date  status  action  qty  time
----------  ----  ------  ------  ---  -----
1234        2017  filled  B        10  11:20
1234        2017  filled  S        10  11:20

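What kind of query should I write for this?

1 answer:

Answer 0: (score: 1)

I would start with a preliminary count over your key columns, keeping only the key combinations that contain both actions: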

select  accountNum, date, status, time
from    yourTable
where   action in ('B', 'S')
group by accountNum, date, status, time
having  count(distinct action) = 2

Then you can join the result above with the initial table to keep only the rows you want:

select  t1.*
from    yourTable t1
join    (
            select  accountNum, date, status, time
            from    yourTable
            where   action in ('B', 'S')
            group by accountNum, date, status, time
            having  count(distinct action) = 2
        ) t2
on      t1.accountNum = t2.accountNum and
        t1.date = t2.date and
        t1.status = t2.status and
        t1.time = t2.time
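
To try the query against the sample data from the question, a minimal setup sketch in MySQL syntax follows. The column types are assumptions, since the question does not show the schema; only the column names and values come from the question.

```sql
-- Hypothetical schema: the types below are guesses, not taken from the question.
create table yourTable (
    accountNum int,
    date       int,          -- the sample data only shows a year
    status     varchar(10),
    action     char(1),      -- 'B' or 'S' in the sample data
    qty        int,
    time       varchar(5)    -- kept as text, e.g. '11:20'
);

-- The six sample rows from the question.
insert into yourTable (accountNum, date, status, action, qty, time) values
    (1234, 2017, 'filled', 'B', 10, '11:20'),
    (1234, 2017, 'filled', 'S', 10, '11:20'),
    (2345, 2017, 'filled', 'B', 20, '12:00'),
    (2345, 2017, 'filled', 'B', 10, '12:00'),
    (4444, 2017, 'filled', 'B',  5, '01:00'),
    (4444, 2017, 'filled', 'S',  5, '02:00');
```

With this data, only the key (1234, 2017, filled, 11:20) has two distinct actions. Account 2345 has two "B" rows and no "S", and the two rows of account 4444 have different times, so they fall into separate groups. The join should therefore return just the two rows for account 1234.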

Edit

I'm not a Hive expert, but if distinct and having are not allowed in subqueries, you can write the query like this:

select  t1.*
from    yourTable t1
join    (
            select  accountNum, date, status, time, count(action) as cnt
            from    yourTable
            where   action in ('B', 'S')
            group by accountNum, date, status, time
        ) t2
on      t1.accountNum = t2.accountNum and
        t1.date = t2.date and
        t1.status = t2.status and
        t1.time = t2.time
where   t2.cnt = 2

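If the same accountNum / date / time / status combination cannot contain multiple instances of the same action, you can remove the distinct entirely.

The having clause can be moved to the outer query as a where condition.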
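
Note that the sample data in the question does not satisfy that assumption: account 2345 has two "B" rows with the same date, status and time, so count(action) is 2 for that group and its rows would pass the cnt = 2 filter. A possible variant that still avoids distinct and having is sketched below. It compares min(action) with max(action): since the subquery only keeps 'B' and 'S', the two differ exactly when both actions are present. The aliases minAction and maxAction are names introduced here for illustration, and the query has not been tested on Hive.

```sql
select  t1.*
from    yourTable t1
join    (
            -- One row per key combination, with the smallest and largest action seen.
            select  accountNum, date, status, time,
                    min(action) as minAction,
                    max(action) as maxAction
            from    yourTable
            where   action in ('B', 'S')
            group by accountNum, date, status, time
        ) t2
on      t1.accountNum = t2.accountNum and
        t1.date = t2.date and
        t1.status = t2.status and
        t1.time = t2.time
-- min <> max means the group holds both 'B' and 'S', however many duplicates exist.
where   t2.minAction <> t2.maxAction
```

On the sample data this keeps only the two rows for account 1234: the 2345 group has 'B' as both its minimum and maximum, so it is filtered out.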