我在R中有一个看起来像这样的表:
CustomerID | ProductID | FiscalPeriod | Volume | GP
ABC123 987654 January 2013 $10,000 $3,000
ABC123 987654 February 2013 $500 $200
ABC123 987654 March 2013 $6,000 $2,000
XYZ555 123456 January 2013 $550 $150
XYZ555 123456 February 2013 $4,000 $800
...
每个客户/产品组合都有一个FiscalPeriod Volume / GP值(即使它为零)。
最终,我想要一个看起来像的输出:
CustomerID | ProductID | 3MthMovAvgVol| 12MthMovAvgVol| 3MthMovAvgGP | 12MthMovAvgGP
ABC123 987654 $7,500 $8,250 $1,750 $1,950
XYZ555 123456 $3,500 $4,650 $600 $800
目标是通过客户和产品组合为量和GP提供3米/ 12米移动平均值/滚动平均值。
我一直在看动物园和ttr,但似乎无法在任何地方找到如何执行。
有什么想法吗?
答案 0 :(得分:0)
## You can use odbc package.For example, you put data in d:/test.xls.
install.packages('RODBC')
library('RODBC')
channel=odbcConnectExcel("d:/test.xls")
## I do not know what 3-months moving average means, you can set conditions using "where" in this SQL below
sqlQuery(channel,'select a.CustomerID,a.avgVol_12, a.avgGP_12 , b.AvgVol_3 ,b.AvgGP_3 from
(select CustomerID,sum(Volume)/12 as avgVol_12,sum(GP)/12 as avgGP_12 from [Sheet1$] group by CustomerID) a
left outer join
(select CustomerID,sum(Volume)/3 as AvgVol_3, sum(GP)/3 as AvgGP_3 from [Sheet1$] where month(FiscalPeriod)>=3 group by CustomerID ) b
on a.CustomerID=b.CustomerID ' )