I have 30 million records (loans), each with a date range (FROM, TO), and I need to create a dummy record for every date within that range.
Sample data:
BALANCE  EFF_FROM_DT  EFF_TO_DT   LOAN_NBR  PAST_DUE_DT
1000     11/1/2018    11/29/2018  1234      10/29/2018
Output data:
BALANCE  Date        EFF_FROM_DT  EFF_TO_DT   LOAN_NBR  PAST_DUE_DT  DPD
1000     11/1/2018   11/1/2018    11/29/2018  1234      10/29/2018   2
1000     11/2/2018   11/1/2018    11/29/2018  1234      10/29/2018   3
1000     11/3/2018   11/1/2018    11/29/2018  1234      10/29/2018   4
...
1000     11/29/2018  11/1/2018    11/29/2018  1234      10/29/2018   30
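For reference, the expansion described above can be sketched in a few lines of Python. This is only an illustration on the single sample row, not the production approach: the function name expand_loan is hypothetical, and DPD is taken here as the plain calendar difference between the date and PAST_DUE_DT. That definition is an assumption; it gives 3 for 11/1/2018, whereas the sample output lists 2, so the intended convention may subtract one more day.

```python
from datetime import date, timedelta


def expand_loan(balance, eff_from, eff_to, loan_nbr, past_due):
    """Yield one row per calendar day in [eff_from, eff_to], inclusive."""
    day = eff_from
    while day <= eff_to:
        # Assumption: DPD is the plain day difference, floored at zero.
        dpd = max((day - past_due).days, 0)
        yield (balance, day, eff_from, eff_to, loan_nbr, past_due, dpd)
        day += timedelta(days=1)


# The sample row from the question.
rows = list(
    expand_loan(1000, date(2018, 11, 1), date(2018, 11, 29), 1234, date(2018, 10, 29))
)
print(len(rows))  # one row per day of the range
```

The same row-multiplying logic is what the QlikView script performs with IterNo(), which is why the row count grows from tens of millions to hundreds of millions.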
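I need to put this in a dashboard and be able to slice the data by other dimensions (credit grade, for example) to see the daily past-due percentage. I started doing this in QlikView, which pulls the data from Netezza and expands it inside QV using the script below. Loading 27 million records (the last 12 months only) and expanding them to daily records (360 million records) takes an hour. Ideally I would like more than 12 months of data (at least 3 years) to see trends, and at that volume QV would take too long to process it. Is there another solution that would reduce the processing time and let me rinse and repeat this process every day?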
// Load the source loan history (one row per loan and effective-date range).
LOAN_HIST:
LOAD BALANCE,
     EFF_FROM_DT,
     EFF_TO_DT,
     LOAN_NBR,
     PASTDUE,
     Grade
FROM
[D:\QVDOCS\DEV\SOURCE\SHF416749\Examples\Test_data.xls]
(biff, embedded labels, table is Sheet1$);

// Expand each range into one row per day: the preceding LOAD repeats
// every input row while the While condition holds.
LOAN_HIST2:
LOAD
     *,
     Date(EFF_FROM_DT + IterNo() - 1) As Date
While EFF_FROM_DT + IterNo() - 1 <= EFF_TO_DT
;
LOAD *
Resident LOAN_HIST order by LOAN_NBR,EFF_FROM_DT;

drop table LOAN_HIST;

// Add calendar fields and the month difference from the past-due date.
LOAN_HIST3:
load
     *,
     day(Date) as DayOfMonth,
     Date(monthstart(Date), 'MMM-YY') as MonthYear,
     ((year(Date)*12)+month(Date)) - (((year(PASTDUE)*12)+month(PASTDUE))) as MonthDiff
resident LOAN_HIST2;

drop table LOAN_HIST2;
Calendar table approach:
DatesData:
LOAD * Inline [
Test_Date
11/1/2018
11/2/2018
11/3/2018
11/4/2018
11/5/2018
11/6/2018
11/7/2018
11/8/2018
11/9/2018
11/10/2018
11/11/2018
11/12/2018
11/13/2018
11/14/2018
11/15/2018
11/16/2018
11/17/2018
11/18/2018
11/19/2018
11/20/2018
11/21/2018
11/22/2018
11/23/2018
11/24/2018
11/25/2018
11/26/2018
11/27/2018
11/28/2018
11/29/2018
11/30/2018
12/1/2018
12/2/2018
12/3/2018
];
ODBC CONNECT TO [NTZ PRD] (XUserId is KbRXeRZGZJMSDZIR, XPassword is DFOcWHZMJDZAUYAHUD);
// Pull rows whose range starts or ends in the last 12 months, or is still open.
LOAN_HIST:
SQL SELECT
    EFF_FROM_DT,
    EFF_TO_DT,
    BALANCE,
    BRACCT,
    PASTDUE
FROM PSAPROD.PSADDS."SHF_DLY_CORE_HSTRY" where
((EFF_FROM_DT >=TO_DATE('$(Today_Date_12mons)','DD-MON-YY') and EFF_FROM_DT <=TO_DATE('$(Today_Date)','DD-MON-YY'))
or
(EFF_TO_DT >=TO_DATE('$(Today_Date_12mons)','DD-MON-YY') and EFF_TO_DT <=TO_DATE('$(Today_Date)','DD-MON-YY'))
or
(EFF_TO_DT >=TO_DATE('31-DEC-9999','DD-MON-YYYY'))) and BALANCE>0
order by BRACCT,EFF_FROM_DT
;

// Replace the open-ended 12/31/9999 end date with today's date.
LOAN_HIST2:
LOAD *,
     if(EFF_TO_DT='12/31/9999',if(BALANCE=0, EFF_FROM_DT, date(today())),if(BALANCE=0,EFF_FROM_DT,EFF_TO_DT)) as EFF_TO_DT2
Resident LOAN_HIST order by BRACCT,EFF_FROM_DT;

drop table LOAN_HIST;

// Link each calendar date to every interval that contains it.
tabMatch:
IntervalMatch (Test_Date)
LOAD EFF_FROM_DT, EFF_TO_DT2
Resident LOAN_HIST2;
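Answer 0: (score: 0)
Have you tried building the dashboard on a view that joins the data to a calendar table?
This example is SAS SQL; the syntax will be slightly different for Netezza.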
data have;
  attrib
    id balance length=8
    from_date to_date due_date format=mmddyy10. informat=mmddyy10.
  ;
  input
    balance from_date: mmddyy10. to_date: mmddyy10. id due_date: mmddyy10.;
  datalines;
500 01/1/2018 2/1/2018 1234 1/15/2018
1000 11/1/2018 11/29/2018 1234 10/29/2018
1500 02/1/2018 3/15/2018 7890 1/15/2018
21000 10/1/2018 11/12/2018 7890 9/30/2018
;
run;

data calendar;
  do date = mdy(1,1,2018) to mdy(12,31,2018);
    output;
  end;
run;
proc sql;
  create view want_view_for_dashboard as
  select
    have.*
    , calendar.date as as_of_date format=mmddyy10.
    , case
        when date > due_date then date-due_date /* or DB datediff function */
      end as days_past_due
  from
    have
    cross join
    calendar
  where
    calendar.date between have.from_date and have.to_date
  ;
quit;
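If the dashboard only needs the daily past-due percentage, the calendar join can feed an aggregate instead of a stored row per loan per day. The Python sketch below illustrates that idea on the same four sample rows. The function name daily_past_due_pct is hypothetical, and computing the percentage by loan count rather than by balance is an assumption; in practice the equivalent GROUP BY would run in the database.

```python
from collections import defaultdict
from datetime import date, timedelta

# Same sample rows as the SAS `have` table: (balance, from, to, id, due).
have = [
    (500, date(2018, 1, 1), date(2018, 2, 1), 1234, date(2018, 1, 15)),
    (1000, date(2018, 11, 1), date(2018, 11, 29), 1234, date(2018, 10, 29)),
    (1500, date(2018, 2, 1), date(2018, 3, 15), 7890, date(2018, 1, 15)),
    (21000, date(2018, 10, 1), date(2018, 11, 12), 7890, date(2018, 9, 30)),
]


def daily_past_due_pct(loans):
    """Return {date: percent of active loan rows that are past due}."""
    active = defaultdict(int)
    past_due = defaultdict(int)
    for _balance, from_date, to_date, _loan_id, due_date in loans:
        day = from_date
        while day <= to_date:
            active[day] += 1
            # Same test as the SQL view: past due once date > due_date.
            if day > due_date:
                past_due[day] += 1
            day += timedelta(days=1)
    # Assumption: percentage by row count, not weighted by balance.
    return {d: 100.0 * past_due[d] / active[d] for d in sorted(active)}


pct = daily_past_due_pct(have)
print(pct[date(2018, 1, 16)])
```

Only one aggregated row per date (and per slicing dimension such as grade, if added to the key) is kept, so the result stays small even when the history covers several years.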