I recently started using the data.table package, and I'm having some trouble with a lookup. Here is the data:
Date MonthNo Unique Items Amounts Total
1: Jan 1 AAA x 10 10
2: Jan 1 BBB y 2 0
3: Feb 2 CCC x 3 3
4: Feb 2 DDD y 15 0
5: March 3 AAA y 20 0
6: March 3 BBB x 35 35
7: April 4 CCC x 15 15
8: April 4 AAA y 50 0
9: May 5 BBB x 60 60
10: May 5 CCC y 70 0
11: June 6 DDD x 100 100
12: June 6 AAA y 20 0
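Basically, I want to create a new column called PYTD, which is the Total for each Unique in each month, but taken from the previous month only. For example: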
Date MonthNo Unique Items Amounts Total PYTD
7: April 4 CCC x 15 15 3
Here is my code so far:
Sys.setlocale("LC_CTYPE", "en_US.UTF-8")
library(data.table)
data <- read.csv("sample.csv")
df <- as.data.frame(data)
#str(df)
dt <- data.table(df)
dt
#str(dt)
dt$Total = ifelse(dt$Items == "x",dt$Amounts,0)
dtgrouped2 = dt[, lapply(.SD, sum, na.rm=TRUE), by=list(MonthNo,Unique),
.SDcol=c("Total")]
dtgrouped2$PYTD <- dtgrouped2[MonthNo == (dtgrouped2$MonthNo-1)
& Unique == dtgrouped2$Unique,Total]
Unfortunately, dtgrouped2$PYTD just gives me NAs.
This is the final result I'm looking for:
MonthNo Unique Total PYTD
1: 1 AAA 10 NA
2: 1 BBB 0 NA
3: 2 CCC 3 NA
4: 2 DDD 0 NA
5: 3 AAA 0 10
6: 3 BBB 35 0
7: 4 CCC 15 3
8: 4 AAA 0 0
9: 5 BBB 60 35
10: 5 CCC 0 15
11: 6 DDD 100 0
12: 6 AAA 0 0
Answer 0 (score: 0)
You can merge the data with itself after incrementing MonthNo in the computed sums:
# create fake data
library(data.table)
set.seed(0)
dt <- data.table(MonthNo = rep(1:4, each = 3),
Unique = LETTERS[1:2],
Total = runif(12))
dt
MonthNo Unique Total
1: 1 A 0.89669720
2: 1 B 0.26550866
3: 1 A 0.37212390
4: 2 B 0.57285336
5: 2 A 0.90820779
6: 2 B 0.20168193
7: 3 A 0.89838968
8: 3 B 0.94467527
9: 3 A 0.66079779
10: 4 B 0.62911404
11: 4 A 0.06178627
12: 4 B 0.20597457
# sum Total per (Unique, MonthNo), move each sum to the following month, then join back onto dt
dt[, list(PYTD = sum(Total)), by = list(Unique, MonthNo)
][, MonthNo := MonthNo + 1][
dt, on = .(MonthNo, Unique)]
Unique MonthNo PYTD Total
1: A 1 NA 0.89669720
2: B 1 NA 0.26550866
3: A 1 NA 0.37212390
4: B 2 0.2655087 0.57285336
5: A 2 1.2688211 0.90820779
6: B 2 0.2655087 0.20168193
7: A 3 0.9082078 0.89838968
8: B 3 0.7745353 0.94467527
9: A 3 0.9082078 0.66079779
10: B 4 0.9446753 0.62911404
11: A 4 1.5591875 0.06178627
12: B 4 0.9446753 0.20597457
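One caveat when applying this to the data in the question: the join only finds a match when the same Unique also appears in the immediately preceding month. The expected table in the question (for example, April CCC getting the February value of 3) looks more like "the Total from the previous month in which that Unique appears". If that reading is right, which is an assumption on my part, a grouped shift() may be closer to what is wanted. The sketch below retypes the sample data by hand (the Date column is left out) and compares both variants; the names grouped, prev and strict are only illustrative.

```r
library(data.table)

# Sample data retyped from the question (Date column omitted)
dt <- data.table(
  MonthNo = rep(1:6, each = 2),
  Unique  = c("AAA", "BBB", "CCC", "DDD", "AAA", "BBB",
              "CCC", "AAA", "BBB", "CCC", "DDD", "AAA"),
  Items   = c("x", "y", "x", "y", "y", "x",
              "x", "y", "x", "y", "x", "y"),
  Amounts = c(10, 2, 3, 15, 20, 35, 15, 50, 60, 70, 100, 20)
)
dt[, Total := ifelse(Items == "x", Amounts, 0)]

# One row per (MonthNo, Unique), as in the question's dtgrouped2
grouped <- dt[, .(Total = sum(Total, na.rm = TRUE)), by = .(MonthNo, Unique)]

# Variant 1: self-join on MonthNo + 1 (strictly the preceding month)
prev   <- grouped[, .(Unique, MonthNo = MonthNo + 1L, PYTD = Total)]
strict <- prev[grouped, on = .(MonthNo, Unique)]
# strict$PYTD: NA NA NA NA NA NA NA 0 NA 15 NA NA

# Variant 2: previous occurrence of the same Unique, whatever month it was in
setorder(grouped, MonthNo)                      # make sure rows are in month order
grouped[, PYTD := shift(Total), by = Unique]    # lag Total by one row within each Unique
# grouped$PYTD: NA NA NA NA 10 0 3 0 35 15 0 0
```

Variant 2 reproduces the PYTD column of the desired result in the question; variant 1 is the stricter "previous calendar month" interpretation, so pick whichever matches what PYTD is supposed to mean.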