Here are some sample rows from my data:
A B participant trial CURRENT_ID C
0 1 ppt01 45 3 0 #row1
1 0 ppt01 45 4 0 #row2
0 1 ppt01 45 10 0 #row3
0 0 ppt01 45 11 0 #row4
1 0 ppt01 45 12 0 #row5
0 1 ppt01 87 2 0 #row6
1 0 ppt01 87 3 0 #row7
1 1 ppt01 87 4 1 #row8
1 1 ppt01 87 5 1 #row9
0 1 ppt02 55 5 0 #row10
1 0 ppt02 55 6 0 #row11
0 1 ppt02 55 9 0 #row12
1 0 ppt02 55 10 0 #row13
0 1 ppt02 55 11 1 #row14
1 0 ppt02 55 12 0 #row15
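I need to group the data by participant, trial, and consecutive CURRENT_ID rows. However, within a given participant and trial, a row may need to be counted twice when it belongs to two runs of consecutive CURRENT_ID values. Below is an example of the consecutive rows I need to consider. As you can see, some rows are counted twice (e.g., participant ppt01, trial 45, CURRENT_ID 11): once with the preceding row and once with the following row: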
A B participant trial CURRENT_ID C
0 1 ppt01 45 3 0 #row1
1 0 ppt01 45 4 0 #row2
0 1 ppt01 45 10 0 #row3
0 0 ppt01 45 11 0 #row4
0 0 ppt01 45 11 0 #row4
1 0 ppt01 45 12 0 #row5
0 1 ppt01 87 2 0 #row6
1 0 ppt01 87 3 0 #row7
1 0 ppt01 87 3 0 #row7
1 1 ppt01 87 4 1 #row8
1 1 ppt01 87 4 1 #row8
1 1 ppt01 87 5 1 #row9
0 1 ppt02 55 5 0 #row10
1 0 ppt02 55 6 0 #row11
0 1 ppt02 55 9 0 #row12
1 0 ppt02 55 10 0 #row13
1 0 ppt02 55 10 0 #row13
0 1 ppt02 55 11 1 #row14
0 1 ppt02 55 11 1 #row14
1 0 ppt02 55 12 0 #row15
How can I include the consecutive CURRENT_ID rows when using library(dplyr) and group_by(participant,trial)?
Answer 0 (score: 0)
I don't know how to do this with dplyr, but here is an approach in base R:
# data
dat <- structure(list(A = c(0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 0L,
1L, 0L, 1L, 0L, 1L), B = c(1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L,
1L, 0L, 1L, 0L, 1L, 0L), participant = c("ppt01", "ppt01", "ppt01",
"ppt01", "ppt01", "ppt01", "ppt01", "ppt01", "ppt01", "ppt02",
"ppt02", "ppt02", "ppt02", "ppt02", "ppt02"), trial = c(45L,
45L, 45L, 45L, 45L, 87L, 87L, 87L, 87L, 55L, 55L, 55L, 55L, 55L,
55L), CURRENT_ID = c(3L, 4L, 10L, 11L, 12L, 2L, 3L, 4L, 5L, 5L,
6L, 9L, 10L, 11L, 12L), C = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L,
1L, 0L, 0L, 0L, 0L, 1L, 0L)), .Names = c("A", "B", "participant",
"trial", "CURRENT_ID", "C"), row.names = c(NA, -15L), class = "data.frame")
# where can the consecutives start? Only look at those with same trial/participant
idx <- which(diff(dat[,"CURRENT_ID"])==1)
idx <- Filter(function(i) dat[i,"trial"]==dat[i+1,"trial"], idx)
idx <- Filter(function(i) dat[i,"participant"]==dat[i+1,"participant"], idx)
# create the dataframes
lapply(idx, function(i) dat[c(i,i+1),])
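For readers who want to stay closer to the dplyr workflow from the question, the same idea can be sketched as follows. This is only one possible approach, not a tested solution from the original answer: it assumes dplyr is installed, and the names next_id, starts_pair, pair_id and paired are made up for illustration. It flags each row whose successor in the same participant/trial has CURRENT_ID exactly one higher, then stacks every such row with its successor under a pair label that group_by can use.

```r
library(dplyr)

# Same sample data as above, rebuilt so that this block runs on its own
dat <- data.frame(
  A = c(0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L),
  B = c(1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L),
  participant = rep(c("ppt01", "ppt02"), times = c(9, 6)),
  trial = rep(c(45L, 87L, 55L), times = c(5, 4, 6)),
  CURRENT_ID = c(3L, 4L, 10L, 11L, 12L, 2L, 3L, 4L, 5L, 5L, 6L, 9L, 10L, 11L, 12L),
  C = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 0L)
)

# Flag rows whose next row (within the same participant and trial)
# has a CURRENT_ID exactly one higher. lead() returns NA at the end
# of each group, so pairs never cross a group boundary.
flagged <- dat %>%
  group_by(participant, trial) %>%
  mutate(next_id = lead(CURRENT_ID),
         starts_pair = !is.na(next_id) & next_id - CURRENT_ID == 1) %>%
  ungroup()

# Row positions where a consecutive pair starts (row order is unchanged)
starts <- which(flagged$starts_pair)

# Stack each starting row with its successor and label the pair;
# rows that sit in the middle of a run therefore appear twice
paired <- bind_rows(lapply(seq_along(starts), function(k) {
  cbind(pair_id = k, dat[c(starts[k], starts[k] + 1), ])
}))

# The pairs can now be used as groups (illustrative summary only)
paired %>%
  group_by(participant, trial, pair_id) %>%
  summarise(n_rows = n(), .groups = "drop")
```

The pair_id column is what makes the duplicated rows distinguishable, so any later summarise() runs once per consecutive pair rather than once per trial.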