How do I find the difference between two dates stored as factors?

Time: 2017-10-04 01:55:50

Tags: r format as.date

I have a data frame named "enrollments":

[Image: "enrollments" data frame]

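enrolled_at, unenrolled_at and fully_participated_at are factors. I want to add a new column to my data frame that gives the time difference, in hours, between the two non-empty attributes. The type of the new column does not matter, but it must display the time in this format (HH MM SS).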

I want to do something like the following pseudocode:

If (unenrolled_at == empty && fully_participated_at != empty) 
    newAttributeValue = fully_participated_at - enrolled_at
else if (unenrolled_at != empty && fully_participated_at == empty)
    newAttributeValue = unenrolled_at - enrolled_at
else
    do nothing

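Edit: I tried every method on this site for doing this, but none of them worked. The times are stored in my data frame as factors, whereas the solutions on the site deal with factor - factor or (string) time - (string) time. I also tried the "as.character" and "as.Date" functions separately. So my question is not a duplicate. Rolando Tamayo offered a different way to solve my problem, but it gives me this error: "Error in ymd_hms(comments$unenrolled_at): could not find function "ymd_hms"" (I have installed the lubridate package).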

1 answer:

Answer 0: (score: 1)

You can use the lubridate package:
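The "could not find function "ymd_hms"" error quoted in the question usually means the package was installed but not attached in the current session. A minimal sketch of checking for that, assuming only base R and lubridate itself (the variable names has_lubridate and parsed are made up for illustration):

```r
# TRUE if lubridate is installed and its namespace can be loaded
has_lubridate <- requireNamespace("lubridate", quietly = TRUE)

if (has_lubridate) {
  # Attaching the package makes ymd_hms() visible by its bare name
  library(lubridate)
  parsed <- ymd_hms("2002-06-09 12:45:40 UTC")
  # Alternatively, call it without attaching: lubridate::ymd_hms(...)
} else {
  message("lubridate is not installed; run install.packages(\"lubridate\") first")
}
```

If the call still fails after library(lubridate), restarting the R session and attaching the package again is a reasonable next thing to try.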

library(lubridate)


#Create a df with dates

df<-tibble::tibble(
  enrolled_at=as.factor(c("2002-06-09 12:45:40 UTC","2003-01-29 09:30:40 UTC",
                         "2002-09-04 16:45:40 UTC")),
 unenrolled_at=as.factor(c("2002-11-13 20:00:40 UTC",
                        "2002-07-07 17:30:40","2002-07-07 17:30:40 UTC")))
df
# A tibble: 3 x 2
              enrolled_at           unenrolled_at
                   <fctr>                  <fctr>
1 2002-06-09 12:45:40 UTC 2002-11-13 20:00:40 UTC
2 2003-01-29 09:30:40 UTC     2002-07-07 17:30:40
3 2002-09-04 16:45:40 UTC 2002-07-07 17:30:40 UTC
#Check Class
class(df$enrolled_at)
[1] "factor"
#Check class after function ymd_hms
class(ymd_hms(df$enrolled_at))
[1] "POSIXct" "POSIXt"
#Calculate the difference in days
dif<-ymd_hms(df$unenrolled_at)-ymd_hms(df$enrolled_at)

#Difference as a period
as.period(dif)
 [1] "157d 7H 15M 0S"    "-205d -16H 0M 0S"  "-58d -23H -15M 0S"
#Add as a column in df
df$newAttributeValue<-as.period(ymd_hms(df$unenrolled_at)-ymd_hms(df$enrolled_at))

df
# A tibble: 3 x 3
              enrolled_at           unenrolled_at newAttributeValue
                   <fctr>                  <fctr>      <S4: Period>
1 2002-06-09 12:45:40 UTC 2002-11-13 20:00:40 UTC    157d 7H 15M 0S
2 2003-01-29 09:30:40 UTC     2002-07-07 17:30:40  -205d -16H 0M 0S
3 2002-09-04 16:45:40 UTC 2002-07-07 17:30:40 UTC -58d -23H -15M 0S
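
The example above subtracts only unenrolled_at, while the question also asks for a choice between two end columns and an HH MM SS display. A rough base-R sketch of that follows, under several assumptions: the sample values are invented, empty cells are stored as empty strings, the differences are non-negative, and the helper names to_time and format_hms are hypothetical.

```r
# Invented sample data mirroring the three factor columns in the question;
# empty strings stand in for "empty" values (an assumption)
enrollments <- data.frame(
  enrolled_at           = factor(c("2002-06-09 12:45:40 UTC",
                                   "2003-01-29 09:30:40 UTC",
                                   "2002-09-04 16:45:40 UTC")),
  unenrolled_at         = factor(c("2002-06-10 20:00:40 UTC", "", "")),
  fully_participated_at = factor(c("", "2003-01-29 11:45:10 UTC", ""))
)

# Hypothetical helper: factor -> character -> POSIXct, empty strings become NA
to_time <- function(x) {
  x <- sub(" UTC$", "", as.character(x))
  x[x == ""] <- NA
  as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
}

# Hypothetical helper: non-negative seconds -> "HH:MM:SS" (hours may exceed 24)
format_hms <- function(secs) {
  out <- sprintf("%02d:%02d:%02d",
                 as.integer(secs %/% 3600),
                 as.integer((secs %% 3600) %/% 60),
                 as.integer(secs %% 60))
  out[is.na(secs)] <- NA
  out
}

enrolled   <- to_time(enrollments$enrolled_at)
unenrolled <- to_time(enrollments$unenrolled_at)
completed  <- to_time(enrollments$fully_participated_at)

# Differences in seconds from enrolled_at to each candidate end time
secs_unenrolled <- as.numeric(difftime(unenrolled, enrolled, units = "secs"))
secs_completed  <- as.numeric(difftime(completed, enrolled, units = "secs"))

# Follow the pseudocode: use whichever end time is the only one present,
# otherwise leave the value as NA ("do nothing")
secs <- ifelse(is.na(unenrolled) & !is.na(completed), secs_completed,
        ifelse(!is.na(unenrolled) & is.na(completed), secs_unenrolled,
               NA_real_))

enrollments$time_diff <- format_hms(secs)
enrollments$time_diff
```

Hours can exceed 24 here because the whole difference is expressed as hours, minutes and seconds. Changing the separators in the sprintf format string gives spaces instead of colons.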