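I want to generate the following endogenous lagged variable (Y):
Set Y=1 in the current routine year if submission==1 and routineyear==1 in the previous routine year
Set Y=2 in the current routine year if submission==0 and routineyear==1 in the previous routine year
Otherwise Y=0
Note that the "previous routine year" is not simply the preceding calendar year: the gaps between routine years vary. That is what makes this variable hard to generate.
Basically, I want to generate an endogenous variable that captures the state's behavior in its last routineyear.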
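To illustrate what I want to do:
Suppose country A has a routine year in 1990, and the submission variable is also 1 in that year. This generates Y=1.
Now, A's next routineyear is 1992, with submission=1 and routineyear=1 in that year. The endogenous lag should reflect A's previous behavior, i.e. that of 1990 (Y=1).
Then the next routineyear is 1996, where submission=0 and routineyear=1. In this case the endogenous lag takes the value of A's previous behavior in 1992 (Y=1).
Then the next routineyear is 1998, where submission=1 and routineyear=1. Here the endogenous lag should reflect A's behavior in its last routineyear, 1996. That is: Y=2!
The endogenous lag should look like this (based on the example above):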
country year submission routineyear Y(endo lag)
A 1990 1 1 1
A 1991 0 0 0
A 1992 1 1 1
A 1993 1 0 0
A 1994 0 0 0
A 1995 0 0 0
A 1996 0 1 1
A 1997 0 0 0
A 1998 1 1 2
A 1999 0 0 0
A 2000 0 0 0
A 2001 0 1 1
A 2002 0 0 0
A 2003 1 1 2
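I have tried several different approaches, without success. One of the biggest problems is that the routine years differ from country to country and the gaps between them are irregular.
I believe someone who can write the right code/function in R will be able to solve this puzzle. If not, I would appreciate any suggestions on where to go from here.
A sample from my real data: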
structure(list(ccode = c(31L,31L,31L,31L,31L,31L,31L,31L,31L,
31L,31L,31L,31L,31L,31L,31L,31L,31L,31L,31L,31L,31L,40L,
40L,40L,40L,40L,40L,40L,40L,40L,40L,40L,40L,40L,40L,40L,
40L,40L,40L,40L,40L,40L,40L,41L,41L,41L,41L,41L,41L,41L,
41L,41L,41L,41L,41L,41L,41L,41L,41L,41L,41L,41L,41L,41L,
41L,42L,42L,42L,42L,42L,42L,42L,42L,42L,42L,42L,42L,42L,
42L,42L,42L,42L,42L,42L,42L,42L,42L,51L,51L,51L,51L,51L,
51L,51L,51L,51L,51L,51L,51L,51L,51L,51L,51L,51L,51L,51L,
51L,51L,51L,51L,52L,52L,52L,52L,52L,52L,52L,52L,52L,52L,
52L,52L,52L,52L,52L,52L,52L,52L,52L,52L,52L,52L,53L,53L,
53L,53L,53L,53L,53L,53L,53L,53L,53L,53L,53L,53L,53L,53L,
53L,53L,53L,53L,53L,53L,54L,54L,54L,54L,54L,54L,54L,54L,
54L,54L,54L,54L,54L,54L,54L,54L,54L,54L,54L,54L,54L,54L,
70L,70L,70L,70L,70L,70L,70L,70L,70L,70L,70L,70L,70L,70L,
70L,70L,70L,70L,70L,70L,70L,70L,80L,80L,80L,80L,80L,80L,
80L,80L,80L,80L,80L,80L,80L,80L,80L,80L,80L,80L,80L,80L,
80L,80L,90L,90L,90L,90L,90L,90L,90L,90L,90L,90L,90L,90L,
90L,90L,90L,90L,90L,90L,90L,90L,90L,90L),year = c(1990L,
1991L,1992L,1993L,1994L,1995L,1996L,1997L,1998L,1999L,2000L,
2001L,2002L,2003L,2004L,2005L,2006L,2007L,2008L,2009L,2010L,
2011L,1990L,1991L,1992L,1993L,1994L,1995L,1996L,1997L,1998L,
1999L,2000L,2001L,2002L,2003L,2004L,2005L,2006L,2007L,2008L,
2009L,2010L,2011L,1990L,1991L,1992L,1993L,1994L,1995L,1996L,
1997L,1998L,1999L,2000L,2001L,2002L,2003L,2004L,2005L,2006L,
2007L,2008L,2009L,2010L,2011L,1990L,1991L,1992L,1993L,1994L,
1995L,1996L,1997L,1998L,1999L,2000L,2001L,2002L,2003L,2004L,
2005L,2006L,2007L,2008L,2009L,2010L,2011L,1990L,1991L,1992L,
1993L,1994L,1995L,1996L,1997L,1998L,1999L,1999L,2000L,2001L,
2002L,2003L,2004L,2005L,2006L,2007L,2008L,2009L,2010L,2011L,
1990L,1991L,1992L,1993L,1994L,1995L,1996L,1997L,1998L,1999L,
2000L,2001L,2002L,2003L,2004L,2005L,2006L,2007L,2008L,2009L,
2010L,2011L,1990L,1991L,1992L,1993L,1994L,1995L,1996L,1997L,
1998L,1999L,2000L,2001L,2002L,2003L,2004L,2005L,2006L,2007L,
2008L,2009L,2010L,2011L,1990L,1991L,1992L,1993L,1994L,1995L,
1996L,1997L,1998L,1999L,2000L,2001L,2002L,2003L,2004L,2005L,
2006L,2007L,2008L,2009L,2010L,2011L,1990L,1991L,1992L,1993L,
1994L,1995L,1996L,1997L,1998L,1999L,2000L,2001L,2002L,2003L,
2004L,2005L,2006L,2007L,2008L,2009L,2010L,2011L,1990L,1991L,
1992L,1993L,1994L,1995L,1996L,1997L,1998L,1999L,2000L,2001L,
2002L,2003L,2004L,2005L,2006L,2007L,2008L,2009L,2010L,2011L,
1990L,1991L,1992L,1993L,1994L,1995L,1996L,1997L,1998L,1999L,
2000L,2001L,2002L,2003L,2004L,2005L,2006L,2007L,2008L,2009L,
2010L,2011L),country = structure(c(1L,1L,1L,1L,1L,1L,1L,1L,
1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,4L,4L,4L,
4L,4L,4L,4L,4L,4L,4L,4L,4L,4L,4L,4L,4L,4L,4L,4L,4L,
4L,4L,8L,8L,8L,8L,8L,8L,8L,8L,8L,8L,8L,8L,8L,8L,8L,
8L,8L,8L,8L,8L,8L,8L,6L,6L,6L,6L,6L,6L,6L,6L,6L,6L,
6L,6L,6L,6L,6L,6L,6L,6L,6L,6L,6L,6L,9L,9L,9L,9L,9L,
9L,9L,9L,9L,9L,9L,9L,9L,9L,9L,9L,9L,9L,9L,9L,9L,9L,
9L,11L,11L,11L,11L,11L,11L,11L,11L,11L,11L,11L,11L,11L,
11L,11L,11L,11L,11L,11L,11L,11L,11L,2L,2L,2L,2L,2L,2L,
2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,5L,
5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,5L,
5L,5L,5L,5L,10L,10L,10L,10L,10L,10L,10L,10L,10L,10L,
10L,10L,10L,10L,10L,10L,10L,10L,10L,10L,10L,10L,3L,3L,
3L,3L,3L,3L,3L,3L,3L,3L,3L,3L,3L,3L,3L,3L,3L,3L,3L,
3L,3L,3L,7L,7L,7L,7L,7L,7L,7L,7L,7L,7L,7L,7L,7L,7L,
7L,7L,7L,7L,7L,7L,7L,7L),.Label = c("Bahamas","Barbados",
"Belize","Cuba","Dominica","Dominican Republic","Guatemala",
"Haiti","Jamaica","Mexico","Trinidad and Tobago"),class = "factor"),
submission = c(1L,0L,0L,0L,0L,1L,0L,1L,0L,1L,0L,
1L,0L,1L,0L,1L,0L,0L,0L,1L,0L,1L,1L,0L,1L,0L,
1L,0L,0L,1L,0L,1L,0L,1L,0L,1L,0L,1L,0L,1L,0L,
1L,0L,1L,1L,0L,0L,1L,0L,0L,0L,1L,0L,0L,1L,0L,
0L,0L,0L,0L,1L,0L,1L,1L,0L,0L,0L,0L,1L,0L,0L,
0L,0L,1L,0L,1L,0L,1L,0L,1L,0L,1L,1L,1L,0L,1L,
0L,0L,1L,0L,1L,0L,0L,0L,0L,0L,1L,1L,1L,0L,0L,
1L,1L,0L,1L,0L,1L,0L,1L,0L,0L,1L,0L,0L,0L,1L,
0L,0L,1L,0L,1L,0L,1L,0L,0L,0L,1L,0L,1L,1L,0L,
1L,0L,1L,0L,0L,1L,0L,1L,0L,0L,1L,1L,0L,0L,1L,
0L,0L,0L,1L,0L,0L,1L,0L,0L,1L,0L,0L,0L,0L,0L,
1L,1L,0L,0L,1L,1L,0L,1L,0L,0L,1L,0L,1L,0L,0L,
0L,1L,0L,1L,0L,1L,0L,0L,1L,0L,1L,0L,1L,0L,1L,
1L,0L,1L,0L,1L,0L,1L,0L,1L,0L,0L,0L,0L,0L,0L,
1L,0L,0L,1L,0L,0L,1L,0L,0L,1L,0L,1L,0L,1L,1L,
0L,0L,1L,0L,0L,0L,1L,1L,0L,1L,1L,0L,1L,1L,0L,
1L,0L,1L,0L,1L,0L,0L),routineyear = c(1L,0L,0L,
1L,0L,0L,0L,1L,0L,1L,0L,1L,0L,1L,0L,1L,0L,1L,
0L,0L,0L,1L,0L,0L,1L,0L,1L,0L,0L,1L,0L,1L,0L,
1L,0L,1L,0L,1L,0L,1L,0L,1L,0L,1L,0L,0L,1L,0L,
0L,0L,0L,0L,0L,1L,0L,0L,0L,0L,0L,0L,0L,0L,0L,
0L,0L,1L,0L,0L,1L,0L,0L,0L,0L,1L,0L,1L,0L,1L,
0L,1L,0L,1L,0L,0L,0L,1L,0L,0L,1L,0L,1L,0L,1L,
0L,0L,1L,0L,0L,0L,0L,1L,0L,0L,0L,1L,0L,1L,0L,
0L,0L,1L,0L,0L,1L,0L,0L,0L,0L,1L,0L,1L,0L,1L,
0L,1L,0L,0L,0L,0L,0L,0L,1L,0L,1L,0L,1L,0L,0L,
0L,0L,1L,0L,0L,0L,1L,0L,0L,0L,0L,0L,0L,0L,0L,
0L,1L,0L,0L,1L,0L,0L,0L,0L,0L,0L,1L,0L,0L,0L,
1L,0L,1L,0L,0L,0L,0L,0L,1L,0L,0L,1L,0L,1L,0L,
0L,1L,0L,1L,0L,1L,0L,1L,0L,0L,1L,0L,1L,0L,1L,
0L,1L,0L,0L,0L,1L,0L,0L,1L,0L,1L,0L,0L,0L,0L,
0L,1L,0L,0L,1L,0L,0L,0L,0L,0L,1L,0L,1L,0L,0L,
0L,0L,1L,0L,0L,0L,0L,0L,1L,0L,1L,0L,1L,0L,0L
)),.Names = c("ccode","year","country","submission","routineyear"),class = "data.frame",row.names = c(NA,-243L))
Answer 0 (score: 2)
Using data.table:
library(data.table)
setDT(DF)
DF[, Y := 0
][routineyear == 1
, Y := 1 + (shift(submission, fill = 1) == 0)
, by = country][]
which gives (first 15 rows shown):
> DF
    ccode year country submission routineyear Y
 1:    31 1990 Bahamas          1           1 1
 2:    31 1991 Bahamas          0           0 0
 3:    31 1992 Bahamas          0           0 0
 4:    31 1993 Bahamas          0           1 1
 5:    31 1994 Bahamas          0           0 0
 6:    31 1995 Bahamas          1           0 0
 7:    31 1996 Bahamas          0           0 0
 8:    31 1997 Bahamas          1           1 2
 9:    31 1998 Bahamas          0           0 0
10:    31 1999 Bahamas          1           1 1
11:    31 2000 Bahamas          0           0 0
12:    31 2001 Bahamas          1           1 1
13:    31 2002 Bahamas          0           0 0
14:    31 2003 Bahamas          1           1 1
15:    31 2004 Bahamas          0           0 0
........
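What this does:
setDT(DF) converts your data frame to a data.table.
Y := 0 first sets Y to 0 by reference.
routineyear == 1 then restricts the update to the routine-year rows, and within each country Y is set to 1 when the submission in the previous routine year was 1, and to 2 when it was 0. The first routine year of a country has no predecessor and gets 1 (fill = 1).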
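For anyone who wants to check the logic without extra packages, here is a minimal base R sketch of the same idea, run on the country A example from the question. The helper name endo_lag is hypothetical (it is not part of any package), and the sketch assumes the rows are already sorted by year within each country.

```r
# Toy data for country A, copied from the example table in the question.
dat <- data.frame(
  country     = "A",
  year        = 1990:2003,
  submission  = c(1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1),
  routineyear = c(1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1)
)
# Y column the question asks for.
expected <- c(1, 0, 1, 0, 0, 0, 1, 0, 2, 0, 0, 1, 0, 2)

# endo_lag() is a hypothetical helper, not the answer's own code.
# Assumption: rows are sorted by year within each country.
endo_lag <- function(df) {
  y  <- rep(0, nrow(df))               # non-routine years stay 0
  ry <- df$routineyear == 1            # routine-year rows only
  # Within each country, shift submission down by one routine year;
  # the first routine year has no predecessor and is treated as 1.
  prev <- ave(df$submission[ry], df$country[ry],
              FUN = function(s) c(1, head(s, -1)))
  y[ry] <- ifelse(prev == 1, 1, 2)     # 1 if previous submission was 1, else 2
  y
}

dat$Y <- endo_lag(dat)
stopifnot(all(dat$Y == expected))
print(dat)
```

The subsetting step is what handles the irregular gaps: once only the routine-year rows are kept, "previous routine year" is simply the previous row within the country.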
Answer 1 (score: 1)
library(dplyr)
select(dat2, -Y) %>%
  filter(routineyear == 1L) %>%
  group_by(country) %>%
  mutate(Y = 2L - lag(submission, default = 1L)) %>%
  ungroup() %>%
  right_join(select(dat2, -Y)) %>%
  # newer dplyr versions may not keep dat2's row order after right_join()
  arrange(country, year) %>%
  mutate(Y = replace(Y, is.na(Y), 0L))
# # A tibble: 14 x 5
# country year submission routineyear Y
# <fct> <int> <int> <int> <int>
# 1 A 1990 1 1 1
# 2 A 1991 0 0 0
# 3 A 1992 1 1 1
# 4 A 1993 1 0 0
# 5 A 1994 0 0 0
# 6 A 1995 0 0 0
# 7 A 1996 0 1 1
# 8 A 1997 0 0 0
# 9 A 1998 1 1 2
# 10 A 1999 0 0 0
# 11 A 2000 0 0 0
# 12 A 2001 0 1 1
# 13 A 2002 0 0 0
# 14 A 2003 1 1 2
all.equal(.Last.value, dat2)
# [1] TRUE
where dat2 is:
dat2 <- read.table(text =
"country year submission routineyear Y
A 1990 1 1 1
A 1991 0 0 0
A 1992 1 1 1
A 1993 1 0 0
A 1994 0 0 0
A 1995 0 0 0
A 1996 0 1 1
A 1997 0 0 0
A 1998 1 1 2
A 1999 0 0 0
A 2000 0 0 0
A 2001 0 1 1
A 2002 0 0 0
A 2003 1 1 2
", header = TRUE)