I have a data frame df, and I want to subset df based on the numeric sequences within each category.
x <- c(1,2,3,4,5,7,9,11,13)
x2 <- x+77
df <- data.frame(x=c(x,x2),y= c(rep("A",9),rep("B",9)))
df
x y
1 1 A
2 2 A
3 3 A
4 4 A
5 5 A
6 7 A
7 9 A
8 11 A
9 13 A
10 78 B
11 79 B
12 80 B
13 81 B
14 82 B
15 84 B
16 86 B
17 88 B
18 90 B
I only want the rows where x increases by 1, not the rows where x increases by 2, for example:
x y
1 1 A
2 2 A
3 3 A
4 4 A
5 5 A
10 78 B
11 79 B
12 80 B
13 81 B
14 82 B
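I think I have to take the differences between consecutive elements, check whether each difference is >1, and combine that with ddply, but this seems cumbersome. Is there some kind of sequence function I am missing?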
Answer 0 (score: 3)
Use diff:
df[which(c(1,diff(df$x))==1),]
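One caveat, offered as a sketch rather than as part of the answer above: when diff is taken over the whole column, it also spans the boundary between groups, so the first row of group B (x = 78, preceded by 13) is not selected. Assuming the intent is to restart the comparison within each value of y, base R's ave can apply the same test per group. The names step and res are only illustrative.

```r
x <- c(1, 2, 3, 4, 5, 7, 9, 11, 13)
df <- data.frame(x = c(x, x + 77), y = c(rep("A", 9), rep("B", 9)))

# Within each group of y, treat the first row as a step of 1,
# then compute the difference between neighbouring values of x
step <- ave(df$x, df$y, FUN = function(v) c(1, diff(v)))

# Keep only the rows reached by a step of exactly 1
res <- df[step == 1, ]
res
```

With the example data this keeps rows 1 to 5 and 10 to 14, which matches the desired output in the question.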
Answer 1 (score: 2)
Your example seems to be well-behaved and is handled nicely by @agstudy's answer. Should your data misbehave one day, however...
myfun <- function(d, whichDiff = 1) {
  # d is the data.frame you'd like to subset, containing the variable 'x'
  # whichDiff is the difference between values of x you're looking for
  theWh <- which(!as.logical(diff(d$x) - whichDiff))
  # Take the diff of x, subtract whichDiff to get the desired values equal to 0
  # Coerce this to a logical vector and take the inverse (!)
  # which() gets the indexes that are TRUE.
  # allWh <- sapply(theWh, "+", 1)
  # Since the desired rows may be disjoint, use sapply to get each index + 1
  # Seriously? sapply to add 1 to a numeric vector? Not even on a Friday.
  allWh <- theWh + 1
  return(d[sort(unique(c(theWh, allWh))), ])
}
> library(plyr)
>
> ddply(df, .(y), myfun)
x y
1 1 A
2 2 A
3 3 A
4 4 A
5 5 A
6 78 B
7 79 B
8 80 B
9 81 B
10 82 B
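To illustrate what less well-behaved data might look like, here is a minimal sketch on a made-up data frame d (hypothetical, not from the question) that contains two separate runs of consecutive values. The function body is copied from myfun above so that the block runs on its own; the names byFun and byDiff are only illustrative.

```r
# Same logic as myfun above, repeated so this block is self-contained
myfun <- function(d, whichDiff = 1) {
  theWh <- which(!as.logical(diff(d$x) - whichDiff))  # index where each matching step starts
  allWh <- theWh + 1                                  # index where each matching step ends
  d[sort(unique(c(theWh, allWh))), ]
}

# Hypothetical data: runs 1,2 and 6,7,8 separated by steps of 2
d <- data.frame(x = c(1, 2, 4, 6, 7, 8), y = "A")

byFun  <- myfun(d)$x                  # keeps both ends of every step of 1
byDiff <- d$x[c(1, diff(d$x)) == 1]   # keeps only rows reached by a step of 1

byFun    # 1 2 6 7 8
byDiff   # 1 2 7 8
```

The difference is the value 6: it begins the second run, so it is reached by a step of 2 and the plain diff test drops it, while myfun keeps it because it is followed by a step of 1.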