Rolling up data in R at the customer ID and product level

Time: 2016-04-14 12:14:37

Tags: r aggregate reshape summary

I have data in the following format:

id     product   mcg       txn
101    gold      hotel     1
101    gold      hotel     2
101    clas      hotel     22
101    clas      airline   23

I want the output to be:

id     product   hotel_txn   airline_txn
101    gold      3           .
101    clas      22          23

Can anyone help me get the desired output?

Basically, I am looking for an alternative to the CASE WHEN statement in SAS.

3 Answers:

Answer 0: (score: 2)

We can use xtabs, which sums txn for each combination of id/product and mcg:

 xtabs(txn~idprod + mcg, transform(df1, idprod = paste(id, product),
              mcg = paste0(mcg, "_txn")))
 #         mcg
 #idprod     airline_txn hotel_txn
 # 101 clas          23        22
 # 101 gold           0         3
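
Note that xtabs reports 0 for the gold/airline combination, whereas the question shows a missing value (`.`) there. As a minimal sketch, assuming `df1` holds the sample data from the question, base R's `tapply` builds the same cross-tabulation but leaves empty cells as `NA`. The name `rolled` is only illustrative.

```r
# Sample data from the question (df1 is the name used in the xtabs call above)
df1 <- data.frame(id = c(101, 101, 101, 101),
                  product = c("gold", "gold", "clas", "clas"),
                  mcg = c("hotel", "hotel", "hotel", "airline"),
                  txn = c(1, 2, 22, 23))

# Sum txn per id/product (rows) and mcg (columns);
# combinations with no rows stay NA instead of becoming 0
rolled <- with(df1, tapply(txn,
                           list(idprod = paste(id, product),
                                mcg = paste0(mcg, "_txn")),
                           sum))
print(rolled)
```

The result is a matrix whose row names are the id/product pairs; `as.data.frame(rolled)` converts it to a data frame if one is needed.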

Answer 1: (score: 1)

The dcast function from reshape2 is designed for exactly this kind of task:

#creates your data frame
df <- data.frame(id = c(101, 101, 101, 101),
                 product = c("gold", "gold", "clas", "clas"),
                 mcg = c("hotel", "hotel", "hotel", "airline"),
                 txn = c(1, 2, 22, 23))

#installs and loads the required package
install.packages("reshape2")
library(reshape2)

#the function you would use to create the new data frame
df2 <- dcast(df, id + product ~ mcg, value.var = "txn", fun.aggregate = sum)

print(df2)
   id product airline hotel
1 101    clas      23    22
2 101    gold       0     3
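
With `sum` as the aggregation function, dcast fills the missing gold/airline cell with 0. If a missing value is preferred, as in the question's desired output, the `fill` argument of dcast can be set explicitly. The following is a sketch under that assumption; `df3` is just an illustrative name.

```r
library(reshape2)

# Sample data from the question
df <- data.frame(id = c(101, 101, 101, 101),
                 product = c("gold", "gold", "clas", "clas"),
                 mcg = c("hotel", "hotel", "hotel", "airline"),
                 txn = c(1, 2, 22, 23))

# fill = NA_real_ keeps id/product/mcg combinations with no rows as NA
# rather than the default sum of an empty vector (0)
df3 <- dcast(df, id + product ~ mcg, value.var = "txn",
             fun.aggregate = sum, fill = NA_real_)
print(df3)
```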

Answer 2: (score: 0)

You can do this with dplyr and tidyr:

library(dplyr)
library(tidyr)
df %>% group_by(id, product, mcg) %>% summarise(txn = sum(txn)) %>% spread(mcg, txn)
Source: local data frame [2 x 4]
Groups: id, product [2]

     id product airline hotel
  <int>  <fctr>   <int> <int>
1   101    clas      23    22
2   101    gold      NA     3
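
In more recent tidyr releases, `spread` has been superseded by `pivot_wider`. A rough equivalent of the pipeline above, assuming dplyr 1.0 or later and tidyr 1.0 or later are installed, might look like this. The `_txn` suffix is added with `paste0` to match the column names requested in the question, and `wide` is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df <- data.frame(id = c(101, 101, 101, 101),
                 product = c("gold", "gold", "clas", "clas"),
                 mcg = c("hotel", "hotel", "hotel", "airline"),
                 txn = c(1, 2, 22, 23))

wide <- df %>%
  group_by(id, product, mcg) %>%
  summarise(txn = sum(txn), .groups = "drop") %>%  # total txn per id/product/mcg
  mutate(mcg = paste0(mcg, "_txn")) %>%            # hotel -> hotel_txn, etc.
  pivot_wider(names_from = mcg, values_from = txn) # one column per mcg value

print(wide)
```

As with `spread`, combinations that never occur in the data (gold/airline here) come out as `NA`.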