Remove individuals that don't have the required sequence of values for a dummy variable? (panel data)

Time: 2016-04-22 13:09:00

Tags: r

I have panel data and want to keep only the individuals for whom x = 0 at t = 1 and x = 1 at t = 2, so that:

df <- data.frame(
    ID = c(1,1,2,2,3,3,4,4), 
    time = c(1,2,1,2,1,2,1,2), 
    x = c(0,1,0,0,1,1,1,0)
)
  ID time x
1  1    1 0
2  1    2 1
3  2    1 0
4  2    2 0
5  3    1 1
6  3    2 1
7  4    1 1
8  4    2 0 

becomes:

  ID time x
1  1    1 0
2  1    2 1

I have tried to get this result but without success.

1 answer:

Answer 0: (score: 1)

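I extended your example data so that ID 1 also has rows that do not meet the conditions. You can do this with the dplyr library and a grouped filter, as follows: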

library(dplyr)  # provides %>%, group_by() and filter()

df <- rbind(df, data.frame(ID = c(1, 1), time = c(2, 1), x = c(0, 1)))
df
   ID time x
1   1    1 0
2   1    2 1
3   2    1 0
4   2    2 0
5   3    1 1
6   3    2 1
7   4    1 1
8   4    2 0
9   1    2 0
10  1    1 1

# First, get all IDs where both conditions are present
df <- df %>% group_by(ID) %>% filter(any(time == 1 & x == 0) & any(time == 2 & x == 1))
df
Source: local data frame [4 x 3]
Groups: ID [1]

     ID  time     x
  (dbl) (dbl) (dbl)
1     1     1     0
2     1     2     1
3     1     2     0
4     1     1     1

# Filter within those IDs for the specific conditions
df %>% filter((time == 1 & x == 0) | (time == 2 & x == 1))
Source: local data frame [2 x 3]
Groups: ID [1]

     ID  time     x
  (dbl) (dbl) (dbl)
1     1     1     0
2     1     2     1
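
For reference, the two steps above can be chained into one pipeline. This is only a sketch under a few assumptions: it rebuilds the extended example data from scratch, stores the result in a new object called `result` (a name I made up) instead of overwriting `df`, and adds `ungroup()` at the end so that later operations are not silently applied per ID.

```r
library(dplyr)

# Extended example data: ID 1 has two extra rows that do not match.
df <- data.frame(
  ID   = c(1, 1, 2, 2, 3, 3, 4, 4, 1, 1),
  time = c(1, 2, 1, 2, 1, 2, 1, 2, 2, 1),
  x    = c(0, 1, 0, 0, 1, 1, 1, 0, 0, 1)
)

result <- df %>%
  group_by(ID) %>%
  # Step 1: keep only IDs that have both required observations.
  filter(any(time == 1 & x == 0) & any(time == 2 & x == 1)) %>%
  # Step 2: within those IDs, keep only the rows that match.
  filter((time == 1 & x == 0) | (time == 2 & x == 1)) %>%
  # Drop the grouping so the result behaves like a plain table.
  ungroup()

result
```

With the example data this should leave only the two matching rows of ID 1. Step 1 matters because ID 2 also has a row with time = 1 and x = 0, but never reaches x = 1 at time = 2, so filtering on rows alone would keep it.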