Dummy variable by date

Time: 2014-03-04 05:18:42

Tags: r

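Noob question here. I'm trying to create a new column holding a dummy variable that indicates whether a restaurant closed during the most recent recession. This data will then be merged into a different data frame. Thanks in advance.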

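The idea is to take each restaurant's maximum (most recent) review date and pair it with a dummy variable that shows whether the restaurant closed. The recession ran from 12/2007 to 06/2009. If a restaurant's last review falls within that range, the dummy should be 1. I'm not sure whether I should use a logical for this.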

My current code:

library(plyr)  # provides ddply() and summarise()
reviews3 <- ddply(reviews, "business_id", summarise, max_date = max(date))

newdata <- merge(expandedDataFrame2, reviews3, by="business_id")
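
The two steps above (latest review date per business, then a merge) can also be sketched in base R without plyr. This is only an illustration: the toy rows, the `last_review_date` column name, and the `businesses` data frame standing in for `expandedDataFrame2` (which the question does not show) are all assumptions.

```r
# Toy stand-in for `reviews`; these rows are made up for illustration.
reviews <- data.frame(
  date = as.Date(c("2008-03-10", "2009-01-15", "2011-01-26", "2007-12-13")),
  business_id = c("a", "a", "b", "c"),
  stringsAsFactors = FALSE
)

# Sort by date, then keep the last row seen for each business:
# that row carries the most recent review date.
sorted <- reviews[order(reviews$date), ]
last_review <- sorted[!duplicated(sorted$business_id, fromLast = TRUE), ]
names(last_review)[names(last_review) == "date"] <- "last_review_date"

# Hypothetical stand-in for `expandedDataFrame2`, which the question does not show.
businesses <- data.frame(business_id = c("a", "b", "c"),
                         stringsAsFactors = FALSE)
newdata <- merge(businesses, last_review, by = "business_id")
```

Because the rows are subset rather than summarised, the date column keeps its `Date` class, which matters for the later comparison against the recession dates.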

Data frame structure:

> str(reviews)
'data.frame':   229907 obs. of  2 variables:
 $ date       : Date, format: "2011-01-26" "2011-07-27" "2012-06-14" "2010-05-27" ...
 $ business_id: chr  "9yKzy9PApeiPPOUJEtnvkg" "ZRJwVLyzEJq1VAihDhYiow" "6oRAC4uyJCsJl1X0WZpVSA" "_1QQZuf4zZOyFCvXc0o6Vg" ...

Sample:

row.names   date    business_id 
1   X1  2011-01-26  9yKzy9PApeiPPOUJEtnvkg
2   X2  2011-07-27  ZRJwVLyzEJq1VAihDhYiow
3   X3  2012-06-14  6oRAC4uyJCsJl1X0WZpVSA
4   X4  2010-05-27  _1QQZuf4zZOyFCvXc0o6Vg
5   X5  2012-01-05  6ozycU1RpktNG2-1BroVtw
6   X6  2007-12-13  -yxfBYGB6SEqszmxJxd97A
7   X7  2010-02-12  zp713qNhx8d9KCJJnrw1xA
8   X8  2012-07-12  hW0Ne_HTHEAgGF1rAdmR-g
9   X9  2012-08-17  wNUea3IXZWD63bbOQaOH-g
10  X10 2010-08-11  nMHhuYan8e3cONo3PornJA

1 answer:

Answer 0 (score: 2)

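# %m parses the month (%M would parse minutes and give the wrong dates).
# 0 = date inside the recession window, 1 = outside; swap the last two arguments for the reverse.
dat$dummy <- ifelse(dat$date > as.Date("12/01/2007", format = "%m/%d/%Y") &
                    dat$date < as.Date("06/01/2009", format = "%m/%d/%Y"), 0, 1)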

> dat
   row.names       date            business_id dummy
1         X1 2011-01-26 9yKzy9PApeiPPOUJEtnvkg     1
2         X2 2011-07-27 ZRJwVLyzEJq1VAihDhYiow     1
3         X3 2012-06-14 6oRAC4uyJCsJl1X0WZpVSA     1
4         X4 2010-05-27 _1QQZuf4zZOyFCvXc0o6Vg     1
5         X5 2012-01-05 6ozycU1RpktNG2-1BroVtw     1
6         X6 2007-12-13 -yxfBYGB6SEqszmxJxd97A     0
7         X7 2010-02-12 zp713qNhx8d9KCJJnrw1xA     1
8         X8 2012-07-12 hW0Ne_HTHEAgGF1rAdmR-g     1
9         X9 2012-08-17 wNUea3IXZWD63bbOQaOH-g     1
10       X10 2010-08-11 nMHhuYan8e3cONo3PornJA     1
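
The answer above codes dates inside the window as 0, whereas the question asks for 1 when the last review falls within the recession. A minimal sketch of that variant follows. It assumes the window covers whole months, from 2007-12-01 through 2009-06-30 inclusive; the data frame, its values, and the `closed_in_recession` column name are hypothetical.

```r
# Hypothetical last-review dates per business (made-up values).
last_review <- data.frame(
  business_id = c("a", "b", "c", "d"),
  last_review_date = as.Date(c("2007-11-30", "2007-12-01",
                               "2009-06-30", "2011-01-26")),
  stringsAsFactors = FALSE
)

# Recession window from the question: 12/2007 through 06/2009.
# Using the first and last calendar days of those months is an assumption.
recession_start <- as.Date("2007-12-01")
recession_end   <- as.Date("2009-06-30")

# A logical comparison converts directly to 0/1, so ifelse() is not needed.
last_review$closed_in_recession <- as.integer(
  last_review$last_review_date >= recession_start &
  last_review$last_review_date <= recession_end
)
```

ISO-formatted strings such as "2007-12-01" are parsed by `as.Date()` without a `format` argument, which sidesteps format-string mistakes. Keeping the column as a logical (dropping `as.integer()`) would also work if the downstream model accepts it.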