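I have a data.frame containing raw data from eye-tracking software.
The rows of zeros mark the moments when the user blinked. For a short period before and after each blink, the eye tracker's calibration is inaccurate. I therefore want to remove the rows that correspond to blinks (those with 0 values) as well as the 4 rows before and after each blink.
You can recreate a similar df with: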
test <- data.frame(a = sample(0:200, 200, replace = TRUE),
                   b = sample(0:200, 200, replace = TRUE),
                   c = sample(0:200, 200, replace = TRUE),
                   d = sample(0:200, 200, replace = TRUE))
test[50:100, ] <- 0
Answer 0 (score: 1)
A base R solution.
Following your example dataset. First, build a logical vector that marks where the zero rows are:
> zeros <- rowSums(test) == 0
> zeros
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[49] FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[73] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[85] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[97] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[145] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[157] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[169] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[181] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[193] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
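One caveat: rowSums(test) == 0 identifies all-zero rows only as long as no column can hold negative values, because e.g. -1 and 1 would cancel out. A minimal sketch of a stricter check, using a made-up data frame `demo` that is not part of the question:

```r
# Hypothetical data: row 2 is a genuine all-zero row, row 3 merely sums to zero
demo <- data.frame(a = c(5, 0, -1), b = c(3, 0, 1))

by_sum   <- rowSums(demo) == 0       # flags rows 2 and 3
by_count <- rowSums(demo != 0) == 0  # flags a row only if every value in it is 0

by_sum
by_count
```

For the example data, which contains only values from 0 to 200, both checks give the same result.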
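We pad the vector with dummy FALSE entries (five in front, four at the end) so that every real row has at least four items before and after it:
> zeros <- c(rep(FALSE, 5), zeros, rep(FALSE, 4))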
Then compute a rolling sum over a window of 9 (the four preceding rows, the row under consideration, and the four following rows):
> rolling <- diff(cumsum(zeros), 9)
> rolling
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[38] 0 0 0 0 0 0 0 0 1 2 3 4 5 6 7 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
[75] 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 8 7 6 5 4 3 2 1 0 0 0 0 0 0 0
[112] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
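The trick works because diff(cumsum(z), 9)[i] equals the sum of z[(i + 1):(i + 9)]; with five padding values in front, that range covers original positions i - 4 to i + 4. A small sketch on a made-up vector `x` (an assumption for illustration only) with a single TRUE at position 6:

```r
x <- rep(FALSE, 10)
x[6] <- TRUE                                    # one "blink" at position 6

padded  <- c(rep(FALSE, 5), x, rep(FALSE, 4))   # five in front, four at the end
rolling <- diff(cumsum(padded), lag = 9)        # sum over positions i - 4 .. i + 4

length(rolling)      # same length as x
which(rolling > 0)   # positions 2 to 10, i.e. 6 - 4 up to 6 + 4
```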
Keep only the rows whose rolling sum is 0, which drops the zero rows and every row within four rows of one:
> output <- test[rolling == 0, ]
> rownames(output)
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22"
[23] "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44"
[45] "45" "105" "106" "107" "108" "109" "110" "111" "112" "113" "114" "115" "116" "117" "118" "119" "120" "121" "122" "123" "124" "125"
[67] "126" "127" "128" "129" "130" "131" "132" "133" "134" "135" "136" "137" "138" "139" "140" "141" "142" "143" "144" "145" "146" "147"
[89] "148" "149" "150" "151" "152" "153" "154" "155" "156" "157" "158" "159" "160" "161" "162" "163" "164" "165" "166" "167" "168" "169"
[111] "170" "171" "172" "173" "174" "175" "176" "177" "178" "179" "180" "181" "182" "183" "184" "185" "186" "187" "188" "189" "190" "191"
[133] "192" "193" "194" "195" "196" "197" "198" "199" "200"
If desired, this can obviously be wrapped in some dplyr {{1}}.
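As one possible way to package the steps for reuse, here is a sketch in base R rather than dplyr. The function name `drop_blinks` and its `pad` argument are hypothetical and not part of the answer; the stand-in data is deterministic so the result does not depend on random draws.

```r
# Hypothetical helper: drops all-zero rows plus `pad` rows on either side of them.
drop_blinks <- function(df, pad = 4) {
  zeros   <- rowSums(df) == 0                           # all-zero rows, as in the answer
  padded  <- c(rep(FALSE, pad + 1), zeros, rep(FALSE, pad))
  rolling <- diff(cumsum(padded), lag = 2 * pad + 1)    # zero rows within +/- pad
  df[rolling == 0, , drop = FALSE]
}

test <- data.frame(a = 1:200, b = 1:200)  # stand-in data without accidental zeros
test[50:100, ] <- 0
cleaned <- drop_blinks(test)
nrow(cleaned)   # rows 46 to 104 are gone, the other rows remain
```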
Edit: fixed an off-by-one error.
Answer 1 (score: 0)
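library(purrr)  # provides map2() and re-exports %>%
# GLX is a column of the real data; it does not exist in the example `test`
vec0s <- which(df$GLX == 0)  # row indices where GLX is 0
indexToRemove <- map2(vec0s - 4, vec0s + 4, function(minVal, maxVal) {
  idx <- minVal:maxVal            # from 4 rows before to 4 rows after each zero row
  idx[idx > 0 & idx <= nrow(df)]  # drop indices that fall outside the data frame
}) %>% unlist() %>% unique()      # keep unique indices where windows overlap
# Guard: with no zero rows, df[-indexToRemove, ] would return zero rows
if (length(indexToRemove) > 0) df <- df[-indexToRemove, ]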
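The same index-based idea can also be written without purrr. A minimal sketch in base R using outer(), run on a deterministic stand-in for the example data; column `a` plays the role of `GLX` here, which is an assumption:

```r
test <- data.frame(a = 1:200, b = 1:200)  # stand-in data without accidental zeros
test[50:100, ] <- 0

zero_rows <- which(test$a == 0)                         # stand-in for df$GLX == 0
idx <- unique(as.vector(outer(zero_rows, -4:4, `+`)))   # every zero row +/- 4
idx <- idx[idx >= 1 & idx <= nrow(test)]                # keep valid row numbers only

# Guard against the empty case, where test[-idx, ] would return zero rows
cleaned <- if (length(idx) > 0) test[-idx, ] else test
nrow(cleaned)
```

outer() builds the full grid of zero-row positions plus offsets in one call, so no explicit loop or mapping function is needed.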