Question

I have a question. Suppose my data looks like this:

   Num condition  y
     1         a  1
     2         a  2
     3         a  3
     4         b  4
     5         b  5
     6         b  6
     7         c  7
     8         c  8
     9         c  9
    10         b 10
    11         b 11
    12         b 12

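I now want to perform a calculation on the b rows (for example, the mean) that depends on which value appears in the rows just before that block of b, in this example a or c. Thanks for your help! 安格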

Answer 1

Is this what you want?

# in order to separate between different runs of condition 'b',
# get length and value of runs of equal values of 'condition'
# as.character() guards against 'condition' being a factor, which rle() rejects
rl <- rle(x = as.character(df$condition))
df$run <- rep(x = seq_len(length(rl$lengths)), times = rl$lengths)

# calculate sum of y, on data grouped by condition and run, and where condition is 'b'
aggregate(y ~ condition + run, data = df, subset = condition == "b", sum)
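
The question asks for a calculation on the b rows that depends on what came before them. A minimal sketch of how the run ids above could be extended to do that follows. It rebuilds the example data from the question; the column name `prev` and the object `res` are illustrative names, not part of the original answer, and the mean is used here instead of the sum.

```r
# Rebuild the example data from the question. stringsAsFactors = FALSE keeps
# 'condition' as character, which rle() requires.
df <- data.frame(
  Num = 1:12,
  condition = rep(c("a", "b", "c", "b"), each = 3),
  y = 1:12,
  stringsAsFactors = FALSE
)

# Run-length encode 'condition' and give every run its own id.
rl <- rle(df$condition)
df$run <- rep(seq_along(rl$lengths), times = rl$lengths)

# Value of the run that precedes each run (NA for the first run),
# expanded so every row carries the predecessor of its own run.
prev_value <- c(NA, head(rl$values, -1))
df$prev <- prev_value[df$run]   # 'prev' is a hypothetical column name

# Mean of y over the 'b' rows, grouped by the condition that preceded the run.
res <- aggregate(y ~ prev, data = df, subset = condition == "b", FUN = mean)
print(res)
```

With the example data this gives one row per preceding condition: the b run that follows a and the b run that follows c are summarised separately.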

Answer 2

You can add a "lagged" condition column to the data frame (assumed here to be called DF):

> DF <- within(DF, lag_cond <- c(NA, head(as.character(condition), -1)))

Result:

   Num condition  y lag_cond
     1         a  1     <NA>
     2         a  2        a
     3         a  3        a
     4         b  4        a
     5         b  5        b
     6         b  6        b
     7         c  7        b
     8         c  8        c
     9         c  9        c
    10         b 10        c
    11         b 11        b
    12         b 12        b

Now you can identify the rows you want:

> DF[with(DF, condition=="b" & lag_cond %in% c("a","c")),]
   Num condition  y lag_cond
     4         b  4        a
    10         b 10        c
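
Note that the lagged column only flags the first row of each block of b. If the calculation should cover the whole block, one possible extension is to carry the predecessor forward over every row of the run. The sketch below assumes the example data from the question; `is_start`, `run_id`, `pred` and `b_means` are illustrative names, not part of the original answer.

```r
# Example data from the question, with 'condition' kept as character.
DF <- data.frame(
  Num = 1:12,
  condition = rep(c("a", "b", "c", "b"), each = 3),
  y = 1:12,
  stringsAsFactors = FALSE
)

# Lagged condition column, as in the answer above.
DF$lag_cond <- c(NA, head(DF$condition, -1))

# TRUE on the first row of every run: either the very first row (lag is NA)
# or a row whose condition differs from the previous row's.
is_start <- is.na(DF$lag_cond) | DF$condition != DF$lag_cond

# Running count of run starts gives each row the id of its run.
run_id <- cumsum(is_start)

# The predecessor of a run is lag_cond at that run's first row;
# indexing by run_id spreads it over all rows of the run.
DF$pred <- DF$lag_cond[is_start][run_id]

# Mean of y over the 'b' rows, split by the preceding condition.
is_b <- DF$condition == "b"
b_means <- tapply(DF$y[is_b], DF$pred[is_b], mean)
print(b_means)
```

The design choice here is to avoid a second pass over the data: `cumsum()` on the run-start flags produces the run ids directly from the lagged column.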

R: subsetting based on the value in the previous row
