Current dataset - has Date, ID and Value
Expected Result
Date | ID | Value
--------------------
2/4/17 | 3 | 4.4
2/4/17 | 9 | 6.2
2/5/17 | 3 | 4.4
2/5/17 | 9 | 6.2
2/6/17 | 3 | 4.4
2/6/17 | 9 | 6.2
2/7/17 | 3 | 4.4
2/7/17 | 9 | 6.2
2/8/17 | 3 | 4.4
2/8/17 | 9 | 6.2
2/9/17 | 3 | 4.7
2/9/17 | 4 | 7.4
2/9/17 | 9 | 9.4
2/10/17 | 3 | 4.7
2/10/17 | 4 | 7.4
2/10/17 | 9 | 9.4
2/11/17 | 3 | 9.7
2/11/17 | 7 | 12.4
Expected result - I want to fill in the missing dates and see each ID with its value carried forward until the next date.
Answer 0 (score: 2)
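When I run into this situation, I usually use a script like the one below.
Basically, it "fills" the missing dates on the fly: cross joining all possible dates with all possible ID values produces a row for every date and every ID, and then each row with no Value takes the previous Value of the same ID.
Note that a cross join generates large tables. For example, 10 Dates and 10 IDs produce a table with 100 rows, 20 Dates and 20 IDs produce 400 rows, and so on. With too much history and too many IDs, things get nasty very quickly (keep a close eye on memory consumption).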
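To make the two steps concrete outside of Qlik, here is a minimal Python sketch of the same idea. It is only an illustration, not part of the script: the function name `fill_missing_dates` and the tuple layout are assumptions made for this example. It builds every (date, ID) combination between the earliest and latest date, then carries the last known value forward per ID. Rows before an ID's first appearance stay empty (`None`).

```python
from datetime import date, timedelta


def fill_missing_dates(rows):
    """Hypothetical helper: rows is a list of (date, id, value) tuples.

    Returns one (date, id, value_or_None) tuple for every date/ID pair,
    ordered by ID and then by date.
    """
    known = {(d, i): v for d, i, v in rows}
    ids = sorted({i for _, i, _ in rows})
    first = min(d for d, _, _ in rows)
    last = max(d for d, _, _ in rows)
    # Every calendar day between the first and last date, inclusive
    days = [first + timedelta(days=n) for n in range((last - first).days + 1)]

    result = []
    for i in ids:                  # cross join: every ID ...
        previous = None
        for d in days:             # ... with every date
            # Use the row's own value if present, otherwise the previous one
            value = known.get((d, i), previous)
            result.append((d, i, value))
            previous = value
    return result


sample = [
    (date(2017, 2, 4), 3, 4.4),
    (date(2017, 2, 4), 9, 6.2),
    (date(2017, 2, 9), 3, 4.7),
    (date(2017, 2, 9), 4, 7.4),
    (date(2017, 2, 9), 9, 9.4),
    (date(2017, 2, 11), 3, 9.7),
    (date(2017, 2, 11), 7, 12.4),
]

filled = fill_missing_dates(sample)
print(len(filled))  # 8 dates x 4 IDs = 32 rows
```

The row count is the product of the number of dates and the number of IDs, which is why the table grows so quickly as history and IDs are added.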
// Load temp data
// Just to be sure, convert the original Date field to a date
Data_Temp:
Load
    date(date#(Date, 'M/D/YY')) as Date,
    ID,
    Value
;
Load * Inline [
Date , ID , Value
2/4/17 , 3 , 4.4
2/4/17 , 9 , 6.2
2/9/17 , 3 , 4.7
2/9/17 , 4 , 7.4
2/9/17 , 9 , 9.4
2/11/17 , 3 , 9.7
2/11/17 , 7 , 12.4
];
// Autogenerate all dates
// between the min and max date
// from the Data_Temp table
Dates:
Load
    Date(MinDate + IterNo() - 1) as Date
While
    MinDate + IterNo() - 1 <= MaxDate
;
Load
    Min(Date) as MinDate,
    Max(Date) as MaxDate
Resident
    Data_Temp
;
// Join the distinct IDs to the Dates table.
// The two tables share no fields, so the result is a cross join:
// all dates for all IDs
Join (Dates)
Load
    distinct
    ID
Resident
    Data_Temp
;
// Join the cross-joined table to the data table.
// The result table will have the original data
// plus all missing dates per ID
Join (Data_Temp)
Load * Resident Dates;
// We don't need this anymore
Drop Table Dates;
// This is the fun part ...
// 1. load the Data_Temp table and order it by ID and Date
// 2. for each row:
//    2.1. check if the current ID is not equal to the ID of the previous row
//         if "yes" (new ID set) get the current Value
//         if "no" check if the current Value is > 0
//             if "yes" keep the current Value
//             if "no" get the peek value of the field being built (NewValue)
Data:
// // Uncomment the script section below
// // if we need to exclude the rows with NewValue = null
//Load
//    *
//where
//    NewValue > 0
//;
Load
    ID,
    Date,
    Value,
    // Peek() takes the field name as a quoted string
    if( ID <> peek('ID'), Value,
        if( Value > 0, Value, peek('NewValue') )
    ) as NewValue
Resident
    Data_Temp
Order By
    ID,
    Date
;
// We don't need this anymore
Drop Table Data_Temp;
// // At this point we can drop the old Value field
// // and rename NewValue to Value ... if needed
//Drop Field Value;
//Rename Field NewValue to Value;
After running the script, the result is:
You can download the sample QV file here