客户在R中的3个月和12个月移动平均线

时间:2013-09-03 13:56:06

标签: r

对于R来说相当新,并且有一个问题:移动平均值/滚动平均值

我在R中有一个看起来像这样的表:

CustomerID | ProductID | FiscalPeriod | Volume  | GP
ABC123        987654     January 2013   $10,000  $3,000
ABC123        987654     February 2013  $500     $200   
ABC123        987654     March 2013     $6,000   $2,000
XYZ555        123456     January 2013   $550     $150
XYZ555        123456     February 2013  $4,000   $800
...

每个客户/产品组合都有一个FiscalPeriod Volume / GP值(即使它为零)。

最终,我想要一个看起来像的输出:

CustomerID | ProductID | 3MthMovAvgVol| 12MthMovAvgVol| 3MthMovAvgGP | 12MthMovAvgGP
ABC123       987654      $7,500         $8,250          $1,750         $1,950
XYZ555       123456      $3,500         $4,650          $600           $800

目标是通过客户和产品组合为量和GP提供3米/ 12米移动平均值/滚动平均值。

我一直在看动物园和ttr,但似乎无法在任何地方找到如何执行。

有什么想法吗?

1 个答案:

答案 0 :(得分:0)

## You can use odbc package.For example, you put data in d:/test.xls.
install.packages('RODBC')
library('RODBC')
channel=odbcConnectExcel("d:/test.xls")
## I do not know what 3-months moving average means, you can set conditions using "where" in this SQL below
sqlQuery(channel,'select  a.CustomerID,a.avgVol_12, a.avgGP_12 , b.AvgVol_3 ,b.AvgGP_3   from 
(select CustomerID,sum(Volume)/12 as  avgVol_12,sum(GP)/12 as avgGP_12 from [Sheet1$] group by CustomerID) a  
left outer  join  
(select CustomerID,sum(Volume)/3 as AvgVol_3, sum(GP)/3 as AvgGP_3  from [Sheet1$] where month(FiscalPeriod)>=3 group by CustomerID ) b  
on a.CustomerID=b.CustomerID  ' )