我有一个Customer_ID的SQL表,显示按年份付款。第一个(许多)客户显示如下:
ID Payment Year
112 0 2004
112 0 2005
112 0 2006
112 9592 2007
112 12332 2008
112 9234 2011
112 5400 2012
112 7392 2014
112 8321 2015
请注意,有些年份不见了。我需要为每行创建10个新列,以显示前10年的付款。结果表应如下所示:
ID Payment Year T-1 T-2 T-3 T-4 T-5 T-6 T-7 T-8 T-9 T-10
112 0 2004 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
112 0 2005 0 NULL NULL NULL NULL NULL NULL NULL NULL NULL
112 0 2006 0 0 NULL NULL NULL NULL NULL NULL NULL NULL
112 952 2007 0 0 0 NULL NULL NULL NULL NULL NULL NULL
112 1232 2008 952 0 0 0 NULL NULL NULL NULL NULL NULL
112 924 2011 NULL NULL 1232 952 0 0 0 NULL NULL NULL
112 500 2012 924 NULL NULL 1232 952 0 0 0 NULL NULL
112 392 2014 NULL 500 924 NULL NULL 1232 952 0 0 0
112 821 2015 392 NULL 500 924 NULL NULL 1232 952 0 0
我很清楚这是数据的大量重复,因此似乎很奇怪。但是,我仍然想这样做! (数据正在为预测模型准备,在该模型中,以前的付款(和其他信息)将用于预测当年的付款)
我不太确定从哪里开始。我一直在寻找使用数据透视的方法,但无法弄清楚如何从客户的前一年中选择价值。
我非常想在SQL中执行此操作。如果那不可能,我可以将表复制到R中-但我偏爱使用SQL。
非常感谢任何帮助。
答案 0 :(得分:1)
如果您有完整的数据,则可以使用lag()
:
select t.*,
lag(payment, 1) over (partition by id order by year) as t_1,
lag(payment, 2) over (partition by id order by year) as t_2,
. . .
from t;
但是,对于缺少中间年份的情况,left join
可能会更简单:
select t.*,
t1.payment as t_1,
t2.payment as t_2,
. . .
from t left join
t t1
on t1.id = t.id and
t1.year = t.year - 1 left join
t t2
on t1.id = t.id and
t1.year = t.year - 2 left join
. . .;
答案 1 :(得分:0)
我想你的朋友将成为LAG
这是一个实现:
Declare @t table (
ID int,
Payment int,
Yr int
)
Insert Into @t Values(112,0,2004)
Insert Into @t Values(112,0,2005)
Insert Into @t Values(112,0,2006)
Insert Into @t Values(112,9592,2007)
Insert Into @t Values(112,12332,2008)
Insert Into @t Values(112,9234,2011)
Insert Into @t Values(112,5400,2012)
Insert Into @t Values(112,7392,2014)
Insert Into @t Values(112,8321,2015)
Insert Into @t Values(113,0,2009)
Insert Into @t Values(113,9234,2011)
Insert Into @t Values(113,5400,2013)
Insert Into @t Values(113,8321,2015)
;with E1(n) as (Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1 Union All Select 1)
,E2(n) as (Select 1 From E1 a, E1 b)
,E4(n) as (Select 1 From E2 a, E2 b)
,E5(n) as (Select row_number() over(order by isnull(null,1)) From E4 a, E1 b)
,IDYears as (
Select z.ID, Yr = y.n
From (
Select
Id,
MinYear = min(Yr),
MaxYear = max(Yr)
From @t a
Group By Id
) z
Inner Join E5 y On y.n between z.MinYear and z.MaxYear
)
Select
*,
[t-1] = Lag(B.Payment, 1) Over(Partition By a.ID Order By a.Yr),
[t-2] = Lag(B.Payment, 2) Over(Partition By a.ID Order By a.Yr),
[t-3] = Lag(B.Payment, 3) Over(Partition By a.ID Order By a.Yr),
[t-4] = Lag(B.Payment, 4) Over(Partition By a.ID Order By a.Yr),
[t-5] = Lag(B.Payment, 5) Over(Partition By a.ID Order By a.Yr),
[t-6] = Lag(B.Payment, 6) Over(Partition By a.ID Order By a.Yr),
[t-7] = Lag(B.Payment, 7) Over(Partition By a.ID Order By a.Yr),
[t-8] = Lag(B.Payment, 8) Over(Partition By a.ID Order By a.Yr),
[t-9] = Lag(B.Payment, 9) Over(Partition By a.ID Order By a.Yr),
[t-10] = Lag(B.Payment, 10) Over(Partition By a.ID Order By a.Yr)
From IDYears a
Left Join @t b On a.ID = b.ID and a.Yr = b.Yr
Order By A.ID