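Suppose I have several sets of reports from multiple people. How can I identify the differences between these datasets and decide which data to merge into a particular database?
Example scenario:
Data 1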
Date Sales Revenue
2016-01-01 27 30
2016-01-03 12 10
2016-01-04 48 50
Data 2
Date Sales Revenue
2016-01-01 27 10
2016-01-02 31 40
2016-01-04 48 50
Desired result
Date Sales T1_Revenue T2_Revenue
2016-01-01 27 30 10
2016-01-02 31 NULL 40
2016-01-03 12 10 NULL
2016-01-04 48 50 50
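I have tried various approaches, including combinations of UNION and JOIN, but none of them has worked for me so far.
The closest I have come is the following.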
SELECT d1.date,
d1.sales,
d1.revenue AS T1,
d2.revenue AS T2
FROM dataset1 d1
RIGHT JOIN dataset2 d2 ON d1.date = d2.date
WHERE d1.revenue <> d2.revenue
OR (d1.revenue IS NOT NULL AND d2.revenue IS NULL)
OR (d1.revenue IS NULL AND d2.revenue IS NOT NULL)
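Switching between LEFT JOIN and RIGHT JOIN only changes which side ends up with the missing data.
I searched the site but could not find a solution that works for me =/
Answer 0 (score: 1)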
SELECT x.*
, d1.revenue t1_revenue
, d2.revenue t2_revenue
FROM (SELECT date, sales FROM data1
UNION
SELECT date, sales FROM data2
) x
LEFT JOIN data1 d1
ON d1.date = x.date
LEFT JOIN data2 d2
ON d2.date = x.date
ORDER BY x.date;
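To check the query above against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here (an assumption for the sake of a self-contained example), and the column types and the `rows` variable are my own choices, not something from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data1 (date TEXT, sales INTEGER, revenue INTEGER);
    CREATE TABLE data2 (date TEXT, sales INTEGER, revenue INTEGER);
    INSERT INTO data1 VALUES
        ('2016-01-01', 27, 30), ('2016-01-03', 12, 10), ('2016-01-04', 48, 50);
    INSERT INTO data2 VALUES
        ('2016-01-01', 27, 10), ('2016-01-02', 31, 40), ('2016-01-04', 48, 50);
""")

# Build the full list of (date, sales) pairs first, then attach each
# table's revenue with a LEFT JOIN so a missing side shows up as NULL.
rows = conn.execute("""
    SELECT x.date, x.sales,
           d1.revenue AS t1_revenue,
           d2.revenue AS t2_revenue
    FROM (SELECT date, sales FROM data1
          UNION
          SELECT date, sales FROM data2) x
    LEFT JOIN data1 d1 ON d1.date = x.date
    LEFT JOIN data2 d2 ON d2.date = x.date
    ORDER BY x.date
""").fetchall()

for row in rows:
    print(row)  # SQL NULL is shown as None in Python
```

Note that the inner UNION deduplicates on the pair (date, sales), so a date whose sales value differs between the two tables would produce two output rows.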
Answer 1 (score: 0)
You should use a full join.
SELECT coalesce(d1.date,d2.date) dt,
coalesce(d1.sales,d2.sales) sales,
d1.revenue AS T1Revenue,
d2.revenue AS T2Revenue
FROM dataset1 d1
FULL JOIN dataset2 d2 ON d1.date = d2.date
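Use coalesce to get the non-null value of a column when a row does not exist in one of the two tables.
Because MySQL does not support full join, you can combine left and right joins with union to get the same result.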
SELECT d1.date dt,
d1.sales sales,
d1.revenue AS T1Revenue,
d2.revenue AS T2Revenue
FROM dataset1 d1
LEFT JOIN dataset2 d2 ON d1.date = d2.date
UNION
SELECT d2.date dt,
d2.sales sales,
d1.revenue AS T1Revenue,
d2.revenue AS T2Revenue
FROM dataset1 d1
RIGHT JOIN dataset2 d2 ON d1.date = d2.date
ORDER BY 1
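A quick way to try the full join emulation on the sample data is the sketch below, again using in-memory SQLite from Python as a stand-in for MySQL (an assumption). Older SQLite versions have no RIGHT JOIN, so the second half is written as a LEFT JOIN with the tables swapped, which returns the same rows; `full_join_rows` is just a name I picked.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset1 (date TEXT, sales INTEGER, revenue INTEGER);
    CREATE TABLE dataset2 (date TEXT, sales INTEGER, revenue INTEGER);
    INSERT INTO dataset1 VALUES
        ('2016-01-01', 27, 30), ('2016-01-03', 12, 10), ('2016-01-04', 48, 50);
    INSERT INTO dataset2 VALUES
        ('2016-01-01', 27, 10), ('2016-01-02', 31, 40), ('2016-01-04', 48, 50);
""")

# First half keeps every row of dataset1, second half every row of dataset2.
# UNION (not UNION ALL) removes the rows that both halves produce.
full_join_rows = conn.execute("""
    SELECT d1.date AS dt, d1.sales AS sales,
           d1.revenue AS T1Revenue, d2.revenue AS T2Revenue
    FROM dataset1 d1
    LEFT JOIN dataset2 d2 ON d1.date = d2.date
    UNION
    SELECT d2.date, d2.sales, d1.revenue, d2.revenue
    FROM dataset2 d2
    LEFT JOIN dataset1 d1 ON d1.date = d2.date
    ORDER BY 1
""").fetchall()

for row in full_join_rows:
    print(row)
```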
Answer 2 (score: 0)
One approach is brute force:
select 'd1' as which, d1.*
from data1 d1
where not exists (select 1
from data2 d2
where d1.date = d2.date and d1.revenue <=> d2.revenue
)
union all
select 'd2' as which, d2.*
from data2 d2
where not exists (select 1
from data1 d1
where d1.date = d2.date and d1.revenue <=> d2.revenue
);
Your example query only compares revenue, but you can use the same logic to compare sales. Note that <=> is the NULL-safe comparison operator.
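For anyone who wants to experiment with the NOT EXISTS approach without a MySQL server, here is a rough sketch in Python with in-memory SQLite. SQLite has no <=> operator, so this uses SQLite's IS operator, which also treats two NULLs as equal; that substitution, the explicit column lists, and the names `null_check` and `diff_rows` are assumptions of this sketch, not part of the answer above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data1 (date TEXT, sales INTEGER, revenue INTEGER);
    CREATE TABLE data2 (date TEXT, sales INTEGER, revenue INTEGER);
    INSERT INTO data1 VALUES
        ('2016-01-01', 27, 30), ('2016-01-03', 12, 10), ('2016-01-04', 48, 50);
    INSERT INTO data2 VALUES
        ('2016-01-01', 27, 10), ('2016-01-02', 31, 40), ('2016-01-04', 48, 50);
""")

# Plain "=" yields NULL when comparing NULLs; the NULL-safe form yields true.
null_check = conn.execute("SELECT NULL = NULL, NULL IS NULL").fetchone()
print(null_check)

# Rows from either table that have no counterpart with the same date and
# the same revenue in the other table. "IS" plays the role of MySQL's <=>.
diff_rows = conn.execute("""
    SELECT 'd1' AS which, d1.date, d1.sales, d1.revenue
    FROM data1 d1
    WHERE NOT EXISTS (SELECT 1 FROM data2 d2
                      WHERE d1.date = d2.date AND d1.revenue IS d2.revenue)
    UNION ALL
    SELECT 'd2' AS which, d2.date, d2.sales, d2.revenue
    FROM data2 d2
    WHERE NOT EXISTS (SELECT 1 FROM data1 d1
                      WHERE d1.date = d2.date AND d1.revenue IS d2.revenue)
    ORDER BY 1, 2
""").fetchall()

for row in diff_rows:
    print(row)
```

Only the rows that differ or are missing on one side come back; 2016-01-04 is identical in both tables, so it does not appear.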