Calculating month-over-month change in R

Time: 2016-06-27 13:38:21

Tags: r

I have a table with customer name, payment month, and spend amount, as shown below:

c_name  p_month  spend
  ABC    201401   100
  ABC    201402   150
  ABC    201403   230
  DEF    201401   110
  DEF    201402   190
  DEF    201403   300

I want to calculate, for each customer, the month-over-month change in spend (mom_change) and the month-over-month percentage change (mom_per_change). The expected output is:

c_name  p_month  spend  mom_change  mom_per_change
  ABC    201401   100     Blank        Blank
  ABC    201402   150     50           0.5
  ABC    201403   230     80           0.533
  DEF    201401   110     Blank        Blank
  DEF    201402   190     80           0.727
  DEF    201403   300     110          0.579

I tried computing the change for each customer separately in a loop. The problem is that there are about 10,000 customers, so the loop takes a long time. Any help is much appreciated. Thanks.

3 Answers:

Answer 0 (score: 1)

This can be done with shift() from data.table:

library(data.table)

dt <- data.table(c_name=c("ABC","ABC","ABC","DEF","DEF","DEF"),
                 pmonth=c(201401,201402,201403,201401,201402,201403),
                 spend=c(100,150,230,110,190,300))

dt[, mom_change := (spend-shift(spend)), by=c_name]

dt[, mom_per_change := (spend-shift(spend))/shift(spend), by=c_name]

dt
   c_name pmonth spend mom_change mom_per_change
1:    ABC 201401   100         NA             NA
2:    ABC 201402   150         50      0.5000000
3:    ABC 201403   230         80      0.5333333
4:    DEF 201401   110         NA             NA
5:    DEF 201402   190         80      0.7272727
6:    DEF 201403   300        110      0.5789474
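
One thing worth noting about the approach above: shift() works on row order, so it only gives a month-over-month difference when each customer's rows are already sorted by month. The following is a minimal sketch, assuming the input may arrive unsorted; the shuffled sample rows and the temporary prev_spend column are illustrative assumptions, not part of the original answer.

```r
library(data.table)

# Hypothetical input: the same values as above, but with the rows shuffled.
dt <- data.table(c_name = c("DEF", "ABC", "ABC", "DEF", "ABC", "DEF"),
                 pmonth = c(201403, 201402, 201401, 201401, 201403, 201402),
                 spend  = c(300, 150, 100, 110, 230, 190))

# Sort by customer, then month, so the previous row is the previous month.
setorder(dt, c_name, pmonth)

# Compute the lagged spend once per customer and reuse it for both columns.
dt[, prev_spend := shift(spend), by = c_name]
dt[, `:=`(mom_change     = spend - prev_spend,
          mom_per_change = (spend - prev_spend) / prev_spend)]

# Drop the temporary helper column (an assumption of this sketch).
dt[, prev_spend := NULL]
```

Computing the lag once avoids calling shift() three times per group, which may matter with many customers, though the gain has not been measured here.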
  

Answer 1 (score: 1)

Here is a solution using data.table, with the blanks replaced by NA:

library(data.table)
setDT(df)[, `:=` (mom_change = c(NA, diff(spend)), 
                  mom_per_change = round(c(NA, diff(spend))/shift(spend), 3)), .(c_name)]
df
   c_name p_month spend mom_change mom_per_change
1:    ABC  201401   100         NA             NA
2:    ABC  201402   150         50          0.500
3:    ABC  201403   230         80          0.533
4:    DEF  201401   110         NA             NA
5:    DEF  201402   190         80          0.727
6:    DEF  201403   300        110          0.579

Answer 2 (score: 0)

A dplyr approach:

library(dplyr)
df %>% 
 group_by(c_name) %>% 
 mutate(mom_change = c(NA, diff(spend)), mom_per_change = (spend - lag(spend))/lag(spend))

#Source: local data frame [6 x 5]
#Groups: c_name [2]

#  c_name p_month spend mom_change mom_per_change
#  (fctr)   (int) (int)      (dbl)          (dbl)
#1    ABC  201401   100         NA             NA
#2    ABC  201402   150         50      0.5000000
#3    ABC  201403   230         80      0.5333333
#4    DEF  201401   110         NA             NA
#5    DEF  201402   190         80      0.7272727
#6    DEF  201403   300        110      0.5789474
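
For comparison, the same per-customer calculation can be sketched in base R without any packages. This is not one of the answers above, only an illustration using ave(); the helper name lag1 is hypothetical, and the data frame is rebuilt here from the sample values in the question.

```r
# Sample data from the question, rebuilt so the sketch is self-contained.
df <- data.frame(c_name  = rep(c("ABC", "DEF"), each = 3),
                 p_month = rep(c(201401, 201402, 201403), times = 2),
                 spend   = c(100, 150, 230, 110, 190, 300))

# Sort by customer and month so each row follows the previous month.
df <- df[order(df$c_name, df$p_month), ]

# Hypothetical helper: shift a vector down by one position, padding with NA.
lag1 <- function(x) c(NA, head(x, -1))

# ave() applies lag1 within each customer and returns a full-length vector.
prev_spend <- ave(df$spend, df$c_name, FUN = lag1)

df$mom_change     <- df$spend - prev_spend
df$mom_per_change <- round(df$mom_change / prev_spend, 3)
```

Like the grouped solutions above, this avoids an explicit loop over customers; the first month of each customer is NA because there is no earlier month to compare against.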