Shifting values by group

Time: 2017-08-09 07:40:34

Tags: r

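I have an unbalanced panel dataset (the number of observations differs across units). It contains students' individual data and their institutional affiliations. The key piece of information is whether a student obtained a degree or dropped out.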

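In the sample dataset below (df_raw), two variables (degree_same and degree_other) describe whether a student obtained a degree. My challenge is that degrees are not always registered in the correct semester because of administrative delays; the lag can be as long as two or three semesters. For example, student 1 in the sample dataset is registered as obtaining a degree in Spring 2012. I would like to move that value of 1 to Fall 2011, because that is when the degree was actually completed.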

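Student 2 changed institutions after completing a degree at Harvard. The degree was registered in Spring 2012, when the student had already started studying at Stanford. In this case, I want to move the value to the student's last semester at Harvard.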

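The desired result is shown in df_complete (below). My dataset contains more than 30,000 students and 200,000 observations. Again: the dataset is unbalanced, and the lag in degree registration varies from case to case.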

Please note: Harvard and Stanford are used only as examples, because most readers know them.

Thanks.

df_raw <- data.frame(
  student = c(1,1,1,1,2,2,2,2,2,2),
  year = c(2010,2011,2011,2012,2010,2011,2011,2012,2012,2013),
  semester = c("Fall","Spring","Fall","Spring","Fall","Spring","Fall","Spring","Fall","Spring"),
  institution = c("Stanford","Stanford","Stanford","Stanford","Harvard","Harvard","Harvard","Stanford","Stanford","Stanford"),
  level = c("Lower degree","Lower degree","Lower degree","Higher degree","Lower degree","Lower degree","Lower degree","Lower degree","Lower degree","Lower degree"),
  degree_same = c(0,0,0,1,0,0,0,0,0,0),
  degree_other = c(0,0,0,0,0,0,0,1,0,0))

df_complete <- data.frame(
  student = c(1,1,1,1,2,2,2,2,2,2),
  year = c(2010,2011,2011,2012,2010,2011,2011,2012,2012,2013),
  semester = c("Fall","Spring","Fall","Spring","Fall","Spring","Fall","Spring","Fall","Spring"),
  institution = c("Stanford","Stanford","Stanford","Stanford","Harvard","Harvard","Harvard","Stanford","Stanford","Stanford"),
  level = c("Lower degree","Lower degree","Lower degree","Higher degree","Lower degree","Lower degree","Lower degree","Lower degree","Lower degree","Lower degree"),
  degree_same = c(0,0,1,0,0,0,0,0,0,0),
  degree_other = c(0,0,0,0,0,0,1,0,0,0))
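
One possible way to get from df_raw to the desired result, sketched in base R. It rests on an assumption that the question does not confirm: a late-registered degree always belongs to the last semester of the student's previous "spell", where a spell is a run of consecutive semesters with the same institution and level. The helper name shift_degree_flags and the spell variable are hypothetical, introduced only for illustration. Flags with no earlier spell are left where they are.

```r
# Sample data as described in the question
df_raw <- data.frame(
  student = c(1,1,1,1,2,2,2,2,2,2),
  year = c(2010,2011,2011,2012,2010,2011,2011,2012,2012,2013),
  semester = c("Fall","Spring","Fall","Spring","Fall",
               "Spring","Fall","Spring","Fall","Spring"),
  institution = c(rep("Stanford", 4), rep("Harvard", 3), rep("Stanford", 3)),
  level = c(rep("Lower degree", 3), "Higher degree", rep("Lower degree", 6)),
  degree_same = c(0,0,0,1,0,0,0,0,0,0),
  degree_other = c(0,0,0,0,0,0,0,1,0,0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: move each degree flag to the last row of the
# preceding spell (ASSUMPTION: that is where the degree was completed).
shift_degree_flags <- function(df, flags = c("degree_same", "degree_other")) {
  # Sort chronologically within student; Spring precedes Fall in a calendar year
  df <- df[order(df$student, df$year, df$semester == "Fall"), ]

  pieces <- lapply(split(df, df$student), function(d) {
    n <- nrow(d)
    key <- paste(d$institution, d$level, sep = "|")
    # Spell id increases by one whenever institution or level changes
    spell <- cumsum(c(TRUE, key[-1] != key[-n]))

    for (f in flags) {
      for (i in which(d[[f]] == 1)) {
        prev <- which(spell == spell[i] - 1)
        if (length(prev) > 0) {
          d[[f]][i] <- 0          # clear the late registration
          d[[f]][max(prev)] <- 1  # set it on the last row of the prior spell
        }
      }
    }
    d
  })

  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

df_shifted <- shift_degree_flags(df_raw)
print(df_shifted[, c("student", "year", "semester", "degree_same", "degree_other")])
```

On the sample data this moves the flag for student 1 to the Fall 2011 row and the flag for student 2 to the last Harvard row. If a degree can be registered without any change of institution or level, the spell rule above will not move it, and a different rule would be needed.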

0 answers:

No answers yet.