Aggregate a column from data frame 1, then insert it into data frame 2

Time: 2018-09-05 05:29:40

Tags: r

I am using R 3.5.1.

I have the following data frames, Sales and Freight:

> Freight = data.frame(CustomerId = c("A","A","B","B","C","C"), Product = c("1","2","1","2","1","2"), Cost=c(207,409,116,335,42,222))
> 
> Freight
  CustomerId Product Cost
1          A       1  207
2          A       2  409
3          B       1  116
4          B       2  335
5          C       1   42
6          C       2  222


> Sales_DF <- data.frame(CustomerID = c("A" ,"A","A","A","B","B","B","B","C","C","C","C"), TransactionDate = c("2/14/2018","1/7/2018","2/22/2018","1/14/2018","2/10/2018","1/13/2018","2/3/2018","1/14/2018", "2/19/2018","1/9/2018","2/20/2018","1/23/2018"),Shipment=c(176,54,175,60,118,262,257,470,474,438,82,305))
> Sales_DF
   CustomerID TransactionDate Shipment
1           A       2/14/2018      176
2           A        1/7/2018       54
3           A       2/22/2018      175
4           A       1/14/2018       60
5           B       2/10/2018      118
6           B       1/13/2018      262
7           B        2/3/2018      257
8           B       1/14/2018      470
9           C       2/19/2018      474
10          C        1/9/2018      438
11          C       2/20/2018       82
12          C       1/23/2018      305

How can I sum Shipment in Sales for each customer and each month, and insert the result as a column into Freight?

Desired output:

Customer    Month   Cost    Shipment
A             1      207    114
A             2      409    351
B             1      116    732
B             2      335    375
C             1       42    743
C             2      222    556

1 Answer:

Answer 0: (score: 3)

Here is a base R approach:

#convert the TransactionDate into month for joining with Freight
Sales_DF$Product <- as.POSIXlt(Sales_DF$TransactionDate, format="%m/%d/%Y")$mon + 1L

#merge Freight with aggregated Shipment
merge(Freight,
    #aggregate Shipment by CustomerID and Product
    aggregate(Shipment ~ CustomerID + Product, Sales_DF, sum), 
    by=c("CustomerID", "Product"))

Data:

Freight <- data.frame(CustomerID = c("A","A","B","B","C","C"), Product = c("1","2","1","2","1","2"), Cost=c(207,409,116,335,42,222))
Sales_DF <- data.frame(CustomerID = c("A" ,"A","A","A","B","B","B","B","C","C","C","C"), TransactionDate = c("2/14/2018","1/7/2018","2/22/2018","1/14/2018","2/10/2018","1/13/2018","2/3/2018","1/14/2018", "2/19/2018","1/9/2018","2/20/2018","1/23/2018"),Shipment=c(176,54,175,60,118,262,257,470,474,438,82,305))
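
Note that the question's Freight names its key column CustomerId, while the answer's data renames it to CustomerID so that a single `by` works. The following is a minimal sketch of the same aggregate-then-merge idea that keeps the question's original column name and joins with `by.x` / `by.y` instead. The helper column `Month`, the intermediate `monthly`, and the final `result` are names chosen here for illustration, and converting the month to character to match `Freight$Product` is an assumption about how the keys should line up, not part of the original answer.

```r
# Data as posted in the question: the key is CustomerId in Freight
# but CustomerID in Sales_DF.
Freight <- data.frame(
  CustomerId = c("A", "A", "B", "B", "C", "C"),
  Product    = c("1", "2", "1", "2", "1", "2"),
  Cost       = c(207, 409, 116, 335, 42, 222),
  stringsAsFactors = FALSE
)
Sales_DF <- data.frame(
  CustomerID      = c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"),
  TransactionDate = c("2/14/2018", "1/7/2018", "2/22/2018", "1/14/2018",
                      "2/10/2018", "1/13/2018", "2/3/2018", "1/14/2018",
                      "2/19/2018", "1/9/2018", "2/20/2018", "1/23/2018"),
  Shipment        = c(176, 54, 175, 60, 118, 262, 257, 470, 474, 438, 82, 305),
  stringsAsFactors = FALSE
)

# POSIXlt stores the month as 0-11, hence the + 1L.
# Convert to character so the key has the same type as Freight$Product.
Sales_DF$Month <- as.character(
  as.POSIXlt(Sales_DF$TransactionDate, format = "%m/%d/%Y")$mon + 1L
)

# Sum Shipment per customer and month.
monthly <- aggregate(Shipment ~ CustomerID + Month, data = Sales_DF, FUN = sum)

# Join on differently named key columns with by.x / by.y.
result <- merge(Freight, monthly,
                by.x = c("CustomerId", "Product"),
                by.y = c("CustomerID", "Month"))

# Order rows by customer, then by month, for easier comparison.
result <- result[order(result$CustomerId, result$Product), ]
print(result)
```

With the data above, the Shipment column of `result` should line up with the desired output in the question (for example, customer A in month 1 is 54 + 60 = 114). Because the month key is compared as text here, a data set with months 10 to 12 would sort as "1", "10", "11", ...; converting both keys to integer instead would avoid that.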