I have a data.table, df, in R. It looks like this:
> seq <- c(200,208, 212, 215, 218, 25,28, 232, 236, 245 , 247, 248, 249,265, 276, 284,298, 2, 12, 13, 17,
152, 154, 159,
66, 69, 74, 81, 88, 91, 93, 94, 95, 96)
> cashreg <- rep(c('c1', 'c2', 'c3'), c(21, 3, 10))
> df <- data.table(seq, cashreg)
> df
seq cashreg
1: 200 c1
2: 208 c1
3: 212 c1
4: 215 c1
5: 218 c1
6: 25 c1
7: 28 c1
8: 232 c1
9: 236 c1
10: 245 c1
11: 247 c1
12: 248 c1
13: 249 c1
14: 265 c1
15: 276 c1
16: 284 c1
17: 298 c1
18: 2 c1
19: 12 c1
20: 13 c1
21: 17 c1
22: 152 c2
23: 154 c2
24: 159 c2
25: 66 c3
26: 69 c3
27: 74 c3
28: 81 c3
29: 88 c3
30: 91 c3
31: 93 c3
32: 94 c3
33: 95 c3
34: 96 c3
I have a user-defined maximum for the series:
actual_maximum <- 299
I want a monotonic sequence before and after min(maximum_in_series, actual_maximum), where maximum_in_series is the maximum of seq within each cashreg.
To find the maximum of seq by cashreg, I am trying:
> df[df[,.I[which.max(seq)], by = cashreg]$V1]
seq cashreg
1: 298 c1
2: 159 c2
3: 96 c3
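I want to remove the out-of-sequence numbers before and after these maxima. I am trying to use cummax(seq) for each cashreg.
For example, for c1 I want to apply cummax(seq) up to min(max_series, actual_maximum), which is 298, and remove the out-of-order sequence numbers 25 and 28. Then the maximum of seq should be computed for the remaining series (2, 12, 13, 17); in this case the maximum is 17, so I want to apply cummax(seq) to this part as well.
This process should be done for each cashreg group.
The expected output looks like this: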
seq cashreg
1: 200 c1
2: 208 c1
3: 212 c1
4: 215 c1
5: 218 c1
6: 232 c1
7: 236 c1
8: 245 c1
9: 247 c1
10: 248 c1
11: 249 c1
12: 265 c1
13: 276 c1
14: 284 c1
15: 298 c1
16: 2 c1
17: 12 c1
18: 13 c1
19: 17 c1
20: 152 c2
21: 154 c2
22: 159 c2
23: 66 c3
24: 69 c3
25: 74 c3
26: 81 c3
27: 88 c3
28: 91 c3
29: 93 c3
30: 94 c3
31: 95 c3
32: 96 c3
How can I do this using data.table in R?

Answer 0 (score: 0)
library(data.table)

# For each cash register: split the series at its maximum, keep only the
# running maxima (seq == cummax(seq)) on each side, then stack the pieces.
pieces <- list()
for (cr in unique(df$cashreg)) {
  df_cashreg <- df[cashreg == cr]
  imax <- which.max(df_cashreg$seq)                 # row of the first maximum
  n <- nrow(df_cashreg)
  df1 <- df_cashreg[seq_len(imax)]                  # up to and including the maximum
  df2 <- df_cashreg[seq_len(n - imax) + imax]       # after the maximum (may be empty)
  df1 <- df1[seq == cummax(seq)]
  df2 <- df2[seq == cummax(seq)]
  pieces[[cr]] <- rbind(df1, df2)
}
# Combine in memory instead of appending to a temporary file on disk;
# unique() is kept from the original code and drops exact duplicate rows.
df <- unique(rbindlist(pieces))
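
As an alternative to the loop above, the same idea can probably be written as a single grouped call. The following is only a sketch under some assumptions: keep_monotonic is a hypothetical helper name (not part of data.table), the peak is taken to be the largest value that does not exceed actual_maximum, and any value above actual_maximum is treated as invalid and dropped. It reuses the .I idiom from the question so the column order stays the same.

```r
library(data.table)

# Data from the question
seq <- c(200, 208, 212, 215, 218, 25, 28, 232, 236, 245, 247, 248, 249, 265,
         276, 284, 298, 2, 12, 13, 17,
         152, 154, 159,
         66, 69, 74, 81, 88, 91, 93, 94, 95, 96)
cashreg <- rep(c("c1", "c2", "c3"), c(21, 3, 10))
df <- data.table(seq, cashreg)
actual_maximum <- 299

# Hypothetical helper: returns TRUE for the rows to keep within one group.
# Assumption: values above `cap` are invalid and are never kept.
keep_monotonic <- function(x, cap) {
  valid <- ifelse(x <= cap, x, -Inf)            # mask values above the cap
  peak <- which.max(valid)                      # position of the capped maximum
  after_peak <- seq_along(x) > peak             # FALSE up to the peak, TRUE after
  running <- ave(valid, after_peak, FUN = cummax)  # cummax restarted after the peak
  x == running
}

# .I gives row numbers in df, so the subset keeps the original column order
result <- df[df[, .I[keep_monotonic(seq, actual_maximum)], by = cashreg]$V1]
```

With the sample data this drops only 25 and 28 from c1 and leaves 32 rows. Whether values above actual_maximum should be dropped or handled differently is not stated in the question, so that part may need adjusting.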