I have the following Hive tables:
Table_1
ID
1
1
2
Table_2
ID
1
2
2
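I am comparing the two tables by the number of rows per ID in each table, and I need output like the following:
ID
1 - 2 records in Table 1 and 1 record in Table 2
2 - 1 record in Table 1 and 2 records in Table 2
Table_1 is the parent table.
I am currently using the following queries: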
select count(*),ID from Table_1 group by ID;
select count(*),ID from Table_2 group by ID;
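Answer 0 (score: 0)
Just do a full outer join of the two count queries with the join condition X.id = Y.id, then select from the joined result, checking for nulls on either side (an ID that exists in only one table has a null count on the other side).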
SELECT COALESCE(X.id, Y.id) AS id,
       CONCAT(COALESCE(X.cnt1, 0), ' entries in table 1, ', COALESCE(Y.cnt2, 0), ' entries in table 2') AS summary
FROM (SELECT COUNT(*) AS cnt1, id FROM Table_1 GROUP BY id) X
FULL OUTER JOIN (SELECT COUNT(*) AS cnt2, id FROM Table_2 GROUP BY id) Y
ON X.id = Y.id;
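The null check mentioned in this answer could look roughly like the following. This is a minimal sketch that assumes the tables are named Table_1 and Table_2 as in the question; the aliases X and Y and the column names cnt1 and cnt2 are illustrative. It lists only the IDs that appear in one table but not the other.

```sql
-- Sketch: IDs present in only one of the two tables.
-- A NULL id on one side of the full outer join means that side had no rows for the ID.
SELECT COALESCE(X.id, Y.id) AS id,
       X.cnt1,  -- NULL when the ID is missing from Table_1
       Y.cnt2   -- NULL when the ID is missing from Table_2
FROM (SELECT COUNT(*) AS cnt1, id FROM Table_1 GROUP BY id) X
FULL OUTER JOIN (SELECT COUNT(*) AS cnt2, id FROM Table_2 GROUP BY id) Y
ON X.id = Y.id
WHERE X.id IS NULL OR Y.id IS NULL;
```

With the sample data from the question this returns no rows, because IDs 1 and 2 both occur in both tables.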
Answer 1 (score: 0)
Try this. You can add a CASE expression to choose between "record" and "records" depending on the count.
SELECT m.id,
CONCAT (COALESCE(a.ct, 0), ' record in table 1, ', COALESCE(b.ct, 0),
' record in table 2')
FROM (SELECT id
FROM table_1
UNION
SELECT id
FROM table_2) m
LEFT JOIN (SELECT Count(*) AS ct,
id
FROM table_1
GROUP BY id) a
ON m.id = a.id
LEFT JOIN (SELECT Count(*) AS ct,
id
FROM table_2
GROUP BY id) b
ON m.id = b.id;
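The CASE suggestion could be sketched as follows. This is only one possible way to write it, under the assumption that the tables are named table_1 and table_2 as in the query above; the output column name summary is made up for illustration, and the explicit casts are there to avoid relying on implicit number-to-string conversion.

```sql
-- Sketch: same join as above, with CASE picking the singular or plural noun.
SELECT m.id,
       CONCAT(
         CAST(COALESCE(a.ct, 0) AS STRING),
         CASE WHEN COALESCE(a.ct, 0) = 1 THEN ' record' ELSE ' records' END,
         ' in table 1 and ',
         CAST(COALESCE(b.ct, 0) AS STRING),
         CASE WHEN COALESCE(b.ct, 0) = 1 THEN ' record' ELSE ' records' END,
         ' in table 2') AS summary
FROM (SELECT id FROM table_1
      UNION
      SELECT id FROM table_2) m
LEFT JOIN (SELECT COUNT(*) AS ct, id FROM table_1 GROUP BY id) a
  ON m.id = a.id
LEFT JOIN (SELECT COUNT(*) AS ct, id FROM table_2 GROUP BY id) b
  ON m.id = b.id;
```

For the sample data in the question, ID 1 should read "2 records in table 1 and 1 record in table 2" and ID 2 should read "1 record in table 1 and 2 records in table 2".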
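Answer 2 (score: 0)
You can use this Python program to do a full comparison of 2 Hive tables: https://github.com/bolcom/hive_compared_bq
If you only want a quick comparison based on counts, pass the "--just-count" option (you can also specify the grouping column with "--group-by-column").
If you want a full validation, the script can also show you visually all the differences across all rows and all columns.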