Checking dates against multiple date spans

Time: 2018-12-07 09:40:21

Tags: r

My data looks like this:

Data <- "Person Address Starting.Date  Resignation.Date Job
 John         abc       01.01.2017    03.01.2017        IT      
  Sarah        cde      06.01.2017 06.07.2017       Teacher
  Susi         bfg     09.06.2017  08.09.2017     secretary"
Data <- read.table(text = Data, header = TRUE)

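My goal is to find out how long people stayed before they resigned and to store that information in a new variable. To do so, I check whether the resignation date falls within a given date range, using this code: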

Data$Span<- ifelse(Data$Resignation.Date>= "01.01.2017" & Data$Resignation.Date <= "31.01.2017", 1, 
                              ifelse(Data$Resignation.Date>= "01.02.2017" & Data$Resignation.Date <= "28.02.2017", 2,
                                     ifelse(Data$Resignation.Date>= "01.03.2017" & Data$Resignation.Date <= "31.03.2017", 3,
                                            ifelse(Data$Resignation.Date>= "01.04.2017" & Data$Resignation.Date <= "30.04.2017", 4,
                                                   ifelse(Data$Resignation.Date>="01.05.2017" & Data$Resignation.Date <= "31.05.2017",5, 
                                                          ifelse(Data$Resignation.Date>="01.06.2017" & Data$Resignation.Date<="30.06.2017",6, 
                                                                 ifelse(Data$Resignation.Date>="01.07.2017" & Data$Resignation.Date<="31.07.2017",7, 
                                                                        ifelse(Data$Resignation.Date>="01.08.2017" & Data$Resignation.Date<="31.08.2017", 8,
                                                                               ifelse(Data$Resignation.Date>="01.09.2017" & Data$Resignation.Date<="30.09.2017", 9,
                                                                                      ifelse(Data$Resignation.Date>="01.10.2017" & Data$Resignation.Date<="31.10.2017",10,
                                                                                             ifelse(Data$Resignation.Date>="01.11.2017" & Data$Resignation.Date<="30.11.2017", 11,
                                                                                                    ifelse(Data$Resignation.Date>="01.12.2017" & Data$Resignation.Date<="31.12.2017",12,999))))))))))))

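The data I provided is the subset of people who started working in January. I have such a subset for each of the 12 months of 2017, and I want to use the same code for the people who started in February, March and so on. To do that I have to change the code, because it starts at the first line and adds one month, then adds another month for every following line. For the February subset, for example, it would start with

Data$Resignation.Date>= "01.02.2017" & Data$Resignation.Date <= "28.02.2017", 1,

and end with

 ifelse(Data$Resignation.Date>="01.01.2018" & Data$Resignation.Date<="31.01.2018",12,999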

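Is there any way to do this without copy-pasting the code and changing it by hand for every month? Since the changes follow a systematic pattern I assume it is possible, but I could not find a solution. I looked for one in the dplyr package because I thought my problem would fit there, but that did not help. I would be very grateful for any suggestions, and of course I am happy to answer any follow-up questions.

P.S.: I am not attached to using these subsets; they just make the work easier for me because my knowledge of R is limited. I filtered the subsets with this code: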

Data <- TotalData[TotalData$Starting.Date>= "01.01.2017" & TotalData$Starting.Date <= "31.01.2017",]

2 answers:

Answer 0: (score: 0)

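I think this code is enough to do the job. The logic: if the starting date and the resignation date have the same month, you get 1; if they differ, you get the difference in months, i.e. the number of whole months the employee worked at the company.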

library(lubridate)

Data$Starting.Date <- dmy(Data$Starting.Date)
Data$Resignation.Date <- dmy(Data$Resignation.Date)

# 1 if both dates share the same month, otherwise the number of whole months between them
Data$code <- ifelse(month(Data$Starting.Date) == month(Data$Resignation.Date), 1, (interval(Data$Starting.Date, Data$Resignation.Date) %/% months(1)))

Data:

Data <- structure(list(Person = structure(1:4, .Label = c("John", "johnyy", 
"Sarah", "Susi"), class = "factor"), Address = structure(c(1L, 
1L, 3L, 2L), .Label = c("abc", "bfg", "cde"), class = "factor"), 
    Starting.Date = structure(c(17167, 17199, 17172, 17326), class = "Date"), 
    Resignation.Date = structure(c(17169, 17199, 17353, 17417
    ), class = "Date"), Job = structure(c(1L, 1L, 3L, 2L), .Label = c("IT", 
    "secretary", "Teacher"), class = "factor"), code = c(1, 2, 
    999, 999)), row.names = c(NA, -4L), class = "data.frame")
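
If the goal is to reproduce the numbering of the nested ifelse from the question (1 for a resignation in the starting month, 2 for the following month, and so on) for any starting month, one possible approach is to count calendar months between the two dates. The following is only a minimal base-R sketch under that assumption; `month_index` is a hypothetical helper name, and `as.Date` with an explicit format is used here in place of `dmy()`.

```r
# Sample data from the question, dates as text in dd.mm.yyyy format
Data <- data.frame(
  Person = c("John", "Sarah", "Susi"),
  Starting.Date = c("01.01.2017", "06.01.2017", "09.06.2017"),
  Resignation.Date = c("03.01.2017", "06.07.2017", "08.09.2017"),
  stringsAsFactors = FALSE
)

# Parse the text into Date objects so that the arithmetic is chronological
start <- as.Date(Data$Starting.Date, format = "%d.%m.%Y")
end   <- as.Date(Data$Resignation.Date, format = "%d.%m.%Y")

# Hypothetical helper: running count of calendar months (year * 12 + month)
month_index <- function(d) {
  as.integer(format(d, "%Y")) * 12L + as.integer(format(d, "%m"))
}

# 1 = resigned in the starting month, 2 = the month after, and so on
span <- month_index(end) - month_index(start) + 1L

# Same fallback as the nested ifelse: anything outside 1..12 becomes 999
Data$Span <- ifelse(span >= 1L & span <= 12L, span, 999L)
```

Because the span is computed per row from each person's own starting date, this sketch should not need the monthly subsets at all; it can be applied to the full data set in one step.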

Answer 1: (score: 0)

You can use the lubridate package to get the length of time a person stayed at the company.

library(lubridate)
Data <- "Person Address Starting.Date  Resignation.Date Job
 John         abc       01.01.2017    03.01.2017        IT      
 Sarah        cde      06.01.2017 06.07.2017       Teacher
 Susi         bfg     09.06.2017  08.09.2017     secretary"
Data <- read.table(text=Data, header = TRUE)

Data$Starting.Date = dmy(Data$Starting.Date)
Data$Resignation.Date = dmy(Data$Resignation.Date)


time.interval <- Data$Starting.Date %--% Data$Resignation.Date
time.period <- as.period(time.interval)
time.period <- month(time.period)
Data$Span <- time.period
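
The conversion to real dates in the code above matters for the range checks in the question as well: dates stored as dd.mm.yyyy text are compared character by character, not chronologically. The sketch below illustrates this in base R, using `as.Date` with an explicit format as a stand-in for `dmy()`; the small `TotalData` frame is made up for illustration and is not the asker's data.

```r
# Comparing dd.mm.yyyy text compares characters left to right, not dates
text_cmp <- "15.02.2017" >= "01.03.2017"   # TRUE as text, although 15 Feb precedes 1 Mar

# After parsing, the comparison is chronological
d1 <- as.Date("15.02.2017", format = "%d.%m.%Y")
d2 <- as.Date("01.03.2017", format = "%d.%m.%Y")
date_cmp <- d1 >= d2                       # FALSE

# Date-based version of the subset filter from the question
# (TotalData is a made-up stand-in for the full data set)
TotalData <- data.frame(
  Person = c("John", "Susi"),
  Starting.Date = as.Date(c("01.01.2017", "09.06.2017"), format = "%d.%m.%Y")
)
january <- TotalData[TotalData$Starting.Date >= as.Date("2017-01-01") &
                     TotalData$Starting.Date <= as.Date("2017-01-31"), ]
```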