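Suppose I have a fairly large data frame, roughly one million rows.
I want to extract the rows between BSM and ENDBSM (markers included, as in the output example below) and drop everything else. How can I do this efficiently?
My idea was to first flag those rows with 1, so I use the following loop to mark them, but it takes a very long time.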
chkSTR <- 0
for (i in seq_len(nrow(rData))) {
  # switch the flag on at a start marker
  if (rData$Data[i] == "BSM") {
    chkSTR <- 1
  }
  # switch the flag off at an end marker
  if (rData$Data[i] == "ENDBSM") {
    chkSTR <- 0
  }
  rData$BOOL[i] <- chkSTR
}
Input example
rData = data.frame(
Data =
c(1,"BSM","a",3,3,"ENDBSM",1,3,1,"BSM","b",3,3,"ENDBSM",1,2,1,"BSM","c",2,3,"ENDBSM",1,2)
)
Output example
rData = data.frame(
Data =
c("BSM","a",3,3,"ENDBSM","BSM","b",3,3,"ENDBSM","BSM","c",2,3,"ENDBSM")
)
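Answer 0 (score: 4)
As mentioned in the comments, the number of "BSM" and "ENDBSM" markers is the same and "BSM" always comes first, so we can use mapply to create a sequence between each pair of marker indices and subset with it.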
rData[c(mapply(`:`, which(rData$Data == "BSM"),
which(rData$Data == "ENDBSM"))), , drop = FALSE]
# Data
#2 BSM
#3 a
#4 3
#5 3
#6 ENDBSM
#10 BSM
#11 b
#12 3
#13 3
#14 ENDBSM
#18 BSM
#19 c
#20 2
#21 3
#22 ENDBSM
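One caveat: mapply only simplifies its result to a matrix when every block has the same length, as in the example data. If the blocks can differ in length, a variant along the following lines may be safer. This is only a sketch; keep_blocks is a hypothetical helper name, and it assumes the markers are paired and each "BSM" precedes its "ENDBSM".

```r
# keep_blocks is a hypothetical helper, not part of any package.
keep_blocks <- function(df, col = "Data", start = "BSM", end = "ENDBSM") {
  starts <- which(df[[col]] == start)
  ends <- which(df[[col]] == end)
  # Assumption: markers are paired and every start comes before its end.
  stopifnot(length(starts) == length(ends), all(starts < ends))
  # Map always returns a list, so unlist works for blocks of any length.
  idx <- unlist(Map(`:`, starts, ends))
  df[idx, , drop = FALSE]
}

# Made-up data with blocks of different lengths.
uneven <- data.frame(
  Data = c("x", "BSM", "a", "ENDBSM", "y", "BSM", "b", "c", "d", "ENDBSM"),
  stringsAsFactors = FALSE
)
kept <- keep_blocks(uneven)
```

Here kept holds rows 2 to 4 and 6 to 10 of uneven.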
Answer 1 (score: 1)
We can use map2 from purrr.
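library(purrr)
library(magrittr)  # provides extract(), an alias for `[`
map2(which(rData$Data == "BSM"), which(rData$Data == "ENDBSM"), `:`) %>%
  flatten_int() %>%
  extract(rData, ., , drop = FALSE)  # same as rData[idx, , drop = FALSE]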
Answer 2 (score: 1)
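You can use Reduce to build a flip-flop between BSM and ENDBSM. It does not require the numbers of BSM and ENDBSM to match, nor that BSM comes first: the state simply switches on when BSM appears and off when ENDBSM appears.
With init = FALSE and accumulate = TRUE, idx has one more element than the data has rows, so idx[-length(idx)] is the state before each row and idx[-1] is the state after it. Combining them with | keeps the surrounding BSM and ENDBSM rows. If you want to get rid of the surrounding BSM and ENDBSM, use & instead of | in the subset.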
idx <- Reduce(function(y,x) {(y || x=="BSM") && x!= "ENDBSM"}, x=rData$Data, init=FALSE, accumulate=TRUE)
rData[idx[-1] | idx[-length(idx)], , drop = FALSE]
# Data
#2 BSM
#3 a
#4 3
#5 3
#6 ENDBSM
#10 BSM
#11 b
#12 3
#13 3
#14 ENDBSM
#18 BSM
#19 c
#20 2
#21 3
#22 ENDBSM
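For completeness, here is a sketch of the variant that drops the marker rows themselves. It reuses the same Reduce flip-flop and only changes the final combination from | to &, so a row is kept only when the state is on both before and after it. The names inner and idx are illustrative, and the example data is rebuilt so the snippet runs by itself.

```r
rData <- data.frame(
  Data = c(1, "BSM", "a", 3, 3, "ENDBSM", 1, 3, 1, "BSM", "b", 3, 3, "ENDBSM",
           1, 2, 1, "BSM", "c", 2, 3, "ENDBSM", 1, 2),
  stringsAsFactors = FALSE
)

# State after each row: on from a BSM row until just before an ENDBSM row.
idx <- Reduce(function(y, x) (y || x == "BSM") && x != "ENDBSM",
              x = rData$Data, init = FALSE, accumulate = TRUE)

# idx[-length(idx)] is the state before each row, idx[-1] the state after it.
# Requiring both excludes the BSM row (off before) and the ENDBSM row (off after).
inner <- rData[idx[-1] & idx[-length(idx)], , drop = FALSE]
```

On the example data, inner contains rows 3 to 5, 11 to 13 and 19 to 21.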