Fill in missing rows based on mixed dates

Time: 2019-06-28 17:59:17

Tags: mysql sql etl informatica informatica-cloud

I'm building a mapping in Informatica IICS and trying to fill in missing rows in a data set based on several fields.

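Below is a sample table of the data. There is an ID field, a Week_Start field (the start date of the week the data is reported for), a corresponding Week_Number, and a Year field that specifies whether the data belongs to the prior or the current year. Sales is the number of sales made by that particular ID, and Sales_Type is the category of sale.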

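However, in some weeks a particular person made no sales, so the corresponding row is missing. I want to fill in those rows with all of the relevant information and set the Sales field to 0.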

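My actual data has a 6-week window of information, covering 7 different sales types for both the prior year and the current year. So I expect 6 x 2 x 7 = 84 rows per ID. That is, if I have 100 unique IDs, my final table should have 8,400 rows.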
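The expected row count can be checked with a quick sketch. The week numbers and sales-type names below are placeholders for illustration, not values from the real data.

```python
from itertools import product

# Placeholder values: 6 weeks, 2 year labels, 7 sales types.
weeks = range(1, 7)
years = ["Prior", "Current"]
sales_types = ["A", "B", "C", "D", "E", "F", "G"]

# Every (week, year, sales type) combination that one ID should have.
rows_per_id = len(list(product(weeks, years, sales_types)))
total_rows = rows_per_id * 100  # 100 unique IDs

print(rows_per_id, total_rows)
```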

Table with missing rows:

+----+------------+-------------+---------+-------+------------+
| ID | Week_Start | Week_Number |  Year   | Sales | Sales_Type |
+----+------------+-------------+---------+-------+------------+
|  1 | 01/01/2018 |           1 | Prior   |     1 | A          |
|  1 | 01/08/2018 |           2 | Prior   |     3 | A          |
|  1 | 01/15/2018 |           3 | Prior   |     3 | A          |
|  1 | 01/29/2018 |           5 | Prior   |     4 | A          |
|  1 | 01/01/2019 |           1 | Current |     2 | A          |
|  1 | 01/08/2019 |           2 | Current |     4 | A          |
|  1 | 01/15/2019 |           3 | Current |     1 | A          |
|  1 | 01/22/2019 |           4 | Current |     1 | A          |
|  1 | 01/01/2018 |           1 | Prior   |     1 | B          |
|  1 | 01/08/2018 |           2 | Prior   |     3 | B          |
|  1 | 01/15/2018 |           3 | Prior   |     3 | B          |
|  1 | 01/29/2018 |           5 | Prior   |     4 | B          |
|  1 | 01/01/2019 |           1 | Current |     2 | B          |
|  1 | 01/08/2019 |           2 | Current |     4 | B          |
|  1 | 01/15/2019 |           3 | Current |     1 | B          |
|  1 | 01/22/2019 |           4 | Current |     1 | B          |
+----+------------+-------------+---------+-------+------------+

Expected result with the missing rows filled in:

+----+------------+-------------+---------+-------+------------+
| ID | Week_Start | Week_Number |  Year   | Sales | Sales_Type |
+----+------------+-------------+---------+-------+------------+
|  1 | 01/01/2018 |           1 | Prior   |     1 | A          |
|  1 | 01/08/2018 |           2 | Prior   |     3 | A          |
|  1 | 01/15/2018 |           3 | Prior   |     3 | A          |
|  1 | 01/22/2018 |           4 | Prior   |     0 | A          |
|  1 | 01/29/2018 |           5 | Prior   |     4 | A          |
|  1 | 01/01/2019 |           1 | Current |     2 | A          |
|  1 | 01/08/2019 |           2 | Current |     4 | A          |
|  1 | 01/15/2019 |           3 | Current |     1 | A          |
|  1 | 01/22/2019 |           4 | Current |     1 | A          |
|  1 | 01/29/2019 |           5 | Current |     0 | A          |
|  1 | 01/01/2018 |           1 | Prior   |     1 | B          |
|  1 | 01/08/2018 |           2 | Prior   |     3 | B          |
|  1 | 01/15/2018 |           3 | Prior   |     3 | B          |
|  1 | 01/22/2018 |           4 | Prior   |     0 | B          |
|  1 | 01/29/2018 |           5 | Prior   |     4 | B          |
|  1 | 01/01/2019 |           1 | Current |     2 | B          |
|  1 | 01/08/2019 |           2 | Current |     4 | B          |
|  1 | 01/15/2019 |           3 | Current |     1 | B          |
|  1 | 01/22/2019 |           4 | Current |     1 | B          |
|  1 | 01/29/2019 |           5 | Current |     0 | B          |
+----+------------+-------------+---------+-------+------------+

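I've tried using transformations in ICS, but none of them does what I'm trying to do. My best guess at how to do this is to generate the missing rows with a recursive CTE in SQL and pull that in as a SQL script.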
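The recursive CTE idea could be sketched roughly as follows. This is only an illustration, run through Python's sqlite3 module rather than MySQL. The anchor dates and the 5-week window are assumptions taken from the sample table, and dates are written in ISO format. In MySQL 8.0+, the weekly step would be written with DATE_ADD(week_start, INTERVAL 7 DAY) instead of SQLite's date(..., '+7 days').

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One anchor row per year label; each recursive step adds 7 days.
# Anchor dates and the 5-week limit are assumptions based on the sample table.
calendar_sql = """
WITH RECURSIVE weeks(year_label, week_number, week_start) AS (
    SELECT 'Prior', 1, date('2018-01-01')
    UNION ALL
    SELECT 'Current', 1, date('2019-01-01')
    UNION ALL
    SELECT year_label, week_number + 1, date(week_start, '+7 days')
    FROM weeks
    WHERE week_number < 5
)
SELECT year_label, week_number, week_start
FROM weeks
ORDER BY week_start
"""
weeks = conn.execute(calendar_sql).fetchall()

for row in weeks:
    print(row)
```

Because each year has its own anchor, the recursion never walks from a 2018 date into 2019. Cross joining a calendar like this with the distinct IDs and sales types would give the full set of expected rows, including weeks that no ID reported.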

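My problem is, how do I do this over several partitions? It's not just the dates I'm interested in, but the dates for two years and for several different types of sales. It's further complicated by the fact that the Week_Start column contains mixed data (dates from both years). My early attempts ended up generating every row between a date in 2018 and a date in 2019.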

1 answer:

Answer 0: (score: 1)

Use a cross join to generate the rows and a left join to bring in the values:

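-- Build every (id, week, sales_type) combination, then left join the actual sales.
-- year stays paired with week_start so a 2018 date is never labeled Current.
-- Assumes every week in the window appears in t for at least one id.
select i.id, w.week_start, w.week_number, w.year, s.sales_type,
       coalesce(t.sales, 0) as sales
from (select distinct id from t) i cross join
     (select distinct week_start, week_number, year from t) w cross join
     (select distinct sales_type from t) s left join
     t
     on t.id = i.id and
        t.week_start = w.week_start and
        t.sales_type = s.sales_type;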
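
A minimal runnable sketch of the cross join / left join pattern, using SQLite through Python's sqlite3 module as a stand-in for MySQL. The table name t matches the answer. The rows are a small subset of the sample table written with ISO dates, plus a hypothetical second ID (2) added so that week 4 exists somewhere in the data. This sketch carries ID in the scaffold and keeps Year paired with Week_Start, so that a 2018 date is never combined with the Current label.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, week_start TEXT, week_number INTEGER,"
    " year TEXT, sales INTEGER, sales_type TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2018-01-01", 1, "Prior", 1, "A"),
        (1, "2018-01-08", 2, "Prior", 3, "A"),
        (1, "2018-01-15", 3, "Prior", 3, "A"),
        (1, "2018-01-29", 5, "Prior", 4, "A"),
        (1, "2019-01-01", 1, "Current", 2, "A"),
        (2, "2018-01-22", 4, "Prior", 2, "A"),  # hypothetical second ID
    ],
)

# Scaffold: every id x every (week, year) x every sales type.
# The left join then attaches real sales; missing combinations become 0.
query = """
SELECT i.id, w.week_start, w.week_number, w.year, s.sales_type,
       COALESCE(t.sales, 0) AS sales
FROM (SELECT DISTINCT id FROM t) i
CROSS JOIN (SELECT DISTINCT week_start, week_number, year FROM t) w
CROSS JOIN (SELECT DISTINCT sales_type FROM t) s
LEFT JOIN t
       ON t.id = i.id
      AND t.week_start = w.week_start
      AND t.sales_type = s.sales_type
ORDER BY i.id, s.sales_type, w.week_start
"""
filled = conn.execute(query).fetchall()

for row in filled:
    print(row)
```

Note that the scaffold is built from values already present in t, so a week that no ID reported at all would still be missing. In that case the week list would have to come from a separate calendar source.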