In R

Time: 2017-09-14 01:18:12

Tags: r date time

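My season runs from October 1 to March 31 of the following year. How can I create a dummy variable for the season that shows when a person moves in and out of it?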

 df <- data.frame(ID= c(1:6), 
             Drug = c("A","C","A","A","B","A"),
             Start = c("01/01/2009","07/10/2010","10/10/2009","03/01/2011","03/01/2012","04/12/2010"),
             End=c("09/10/2009","04/20/2011","07/20/1010","01/01/2012","04/01/2013","09/30/2011"))

My output:

   ID Drug      Start        End Season
1   1    A 01/01/2009 09/10/2009      1
2   1    A 01/01/2009 09/10/2009      0
3   2    C 07/10/2010 04/20/2011      0
4   2    C 07/10/2010 04/20/2011      1
5   2    C 07/10/2010 04/20/2011      0
6   3    A 10/10/2009 07/20/1010      1
7   3    A 10/10/2009 07/20/1010      0
8   3    A 10/10/2009 07/20/1010      1
9   4    B 03/01/2011 01/01/2012      1
10  4    B 03/01/2011 01/01/2012      0
11  4    B 03/01/2011 01/01/2012      1
12  5    A 03/01/2012 04/01/2013      1
13  5    A 03/01/2012 04/01/2013      0
14  5    A 03/01/2012 04/01/2013      1
15  5    A 03/01/2012 04/01/2013      0
16  6    A 04/12/2010 09/30/2011      0

ID 1: she starts on 01/01 and ends on 09/10.

[01/01, 03/31] =1

[03/31,09/10] = 0

ID 2: she starts on 07/10 and ends on 04/20. I checked:

[07/10, 10/01] = 0

[10/01,03/31] = 1

[03/31, 04/20] = 0

ID 5: she starts on 03/01 and ends on 04/01.

[03/01, 03/31] = 1

[03/31, 10/01] = 0

[10/01, 03/31] = 1

[03/31, 04/01] = 0

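1 answer:

Answer 0: (score: 1)

I think the code below gets ExposedIn and ExposedOut right (note: you need to add 'stringsAsFactors = FALSE' when creating the data frame). However, I did not have enough time to work out the extra totals for the whole seasons covered. I would approach that by adding another column, built with date/time functions, for the total treatment time.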

df$Start <- as.Date(df$Start, format = '%m/%d/%Y')
df$End <- as.Date(df$End, format = '%m/%d/%Y')
df$SeasonIn <- 273 # Oct 1; POSIXlt yday is 0-based (Jan 1 = 0), so 274 in leap years
df$SeasonOut <- 90 # Apr 1, the first day out of season; 91 in leap years
df$ExposedIn <- as.integer(as.POSIXlt(df$Start)$yday >= df$SeasonIn |
                           as.POSIXlt(df$Start)$yday < df$SeasonOut)
df$ExposedOut <- as.integer(as.POSIXlt(df$End)$yday >= df$SeasonIn |
                            as.POSIXlt(df$End)$yday < df$SeasonOut)
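The day-of-year cutoffs above shift by one in leap years. One way around that, sketched here as an assumption rather than as part of the original answer, is to test the month instead of the day of year. The helper name `in_season` and the `TreatmentDays` column are hypothetical, and the sample rows are IDs 1, 2 and 5 from the question.

```r
# Hypothetical helper: TRUE when a date falls between October 1 and March 31.
# Testing the month avoids the leap-year shift in day-of-year cutoffs.
in_season <- function(d) {
  m <- as.POSIXlt(d)$mon + 1  # $mon is 0-based, so add 1 to get 1..12
  m >= 10 | m <= 3
}

# IDs 1, 2 and 5 from the question
df <- data.frame(ID = c(1, 2, 5),
                 Start = c("01/01/2009", "07/10/2010", "03/01/2012"),
                 End = c("09/10/2009", "04/20/2011", "04/01/2013"),
                 stringsAsFactors = FALSE)
df$Start <- as.Date(df$Start, format = "%m/%d/%Y")
df$End <- as.Date(df$End, format = "%m/%d/%Y")

df$ExposedIn <- as.integer(in_season(df$Start))
df$ExposedOut <- as.integer(in_season(df$End))

# Total treatment time in days (subtracting two Dates gives a difference in days)
df$TreatmentDays <- as.numeric(df$End - df$Start)
```

This only flags the first and last day of each treatment. It does not say how many seasons fall in between.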

Hope this helps at least a little.
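For the one-row-per-segment layout the question asks for, a rough sketch is to cut each treatment interval at every October 1 and April 1 it crosses and then flag each piece. The function names `in_season` and `split_by_season` and the `From`/`To` columns are hypothetical, and the sketch assumes `Start` is never later than `End`.

```r
# Hypothetical helper: TRUE when a date falls between October 1 and March 31.
in_season <- function(d) {
  m <- as.POSIXlt(d)$mon + 1  # $mon is 0-based
  m >= 10 | m <= 3
}

# Hypothetical helper: split one interval at the season boundaries it crosses.
# Assumes start <= end.
split_by_season <- function(start, end) {
  years <- as.integer(format(start, "%Y")):as.integer(format(end, "%Y"))
  # The season opens on Oct 1; Apr 1 is the first day out of season
  bounds <- sort(as.Date(c(paste0(years, "-04-01"), paste0(years, "-10-01"))))
  cuts <- bounds[bounds > start & bounds <= end]
  from <- c(start, cuts)    # each segment starts at the interval start or a boundary
  to <- c(cuts - 1, end)    # and ends the day before the next boundary
  data.frame(From = from, To = to, Season = as.integer(in_season(from)))
}

# IDs 1, 2 and 5 from the question
df <- data.frame(ID = c(1, 2, 5),
                 Start = as.Date(c("01/01/2009", "07/10/2010", "03/01/2012"),
                                 format = "%m/%d/%Y"),
                 End = as.Date(c("09/10/2009", "04/20/2011", "04/01/2013"),
                               format = "%m/%d/%Y"))

pieces <- lapply(seq_len(nrow(df)), function(i) {
  seg <- split_by_season(df$Start[i], df$End[i])
  seg$ID <- df$ID[i]
  seg[c("ID", "From", "To", "Season")]
})
result <- do.call(rbind, pieces)
```

For these three rows the Season flags come out as 1, 0 for ID 1, then 0, 1, 0 for ID 2, then 1, 0, 1, 0 for ID 5, which matches the breakdown given in the question.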