My data looks like this:
Data <- "Person Address Starting.Date Resignation.Date Job
John abc 01.01.2017 03.01.2017 IT
Sarah cde 06.01.2017 06.07.2017 Teacher
Susi bfg 09.06.2017 08.09.2017 secretary"
Data <- read.table(text = Data, header = TRUE)
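My goal is to find out how long people stayed before they resigned, and to store that information in a new variable. To do this, I check whether the resignation date falls within a given date range, using this code: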
Data$Span<- ifelse(Data$Resignation.Date>= "01.01.2017" & Data$Resignation.Date <= "31.01.2017", 1,
ifelse(Data$Resignation.Date>= "01.02.2017" & Data$Resignation.Date <= "28.02.2017", 2,
ifelse(Data$Resignation.Date>= "01.03.2017" & Data$Resignation.Date <= "31.03.2017", 3,
ifelse(Data$Resignation.Date>= "01.04.2017" & Data$Resignation.Date <= "30.04.2017", 4,
ifelse(Data$Resignation.Date>="01.05.2017" & Data$Resignation.Date <= "31.05.2017",5,
ifelse(Data$Resignation.Date>="01.06.2017" & Data$Resignation.Date<="30.06.2017",6,
ifelse(Data$Resignation.Date>="01.07.2017" & Data$Resignation.Date<="31.07.2017",7,
ifelse(Data$Resignation.Date>="01.08.2017" & Data$Resignation.Date<="31.08.2017", 8,
ifelse(Data$Resignation.Date>="01.09.2017" & Data$Resignation.Date<="30.09.2017", 9,
ifelse(Data$Resignation.Date>="01.10.2017" & Data$Resignation.Date<="31.10.2017",10,
ifelse(Data$Resignation.Date>="01.11.2017" & Data$Resignation.Date<="30.11.2017", 11,
ifelse(Data$Resignation.Date>="01.12.2017" & Data$Resignation.Date<="31.12.2017",12,999))))))))))))
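The data I provided is a subset of the people who started working in January. I have such subsets for all 12 months of 2017, and I want to use the same code for the people who started in February, March, and so on. To do that I have to change the code so that the first line starts one month later, and every following line is shifted by one month as well. For the February subset, for example, it would start with
Data$Resignation.Date>= "01.02.2017" & Data$Resignation.Date <= "28.02.2017", 1,
and end with
ifelse(Data$Resignation.Date>="01.01.2018" & Data$Resignation.Date<="31.01.2018",12,999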
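Is there any way to do this without copy-pasting the code and changing it by hand for every month? Since the changes follow a systematic pattern, I think it should be possible, but I could not find a solution. I looked for one in the dplyr package because I thought my problem would fit there, but that did not help. I would greatly appreciate any suggestions, and of course I am happy to answer any follow-up questions.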
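P.S.: I am not keen on using these subsets; it is just easier for me to work this way because my knowledge of R is limited. I filtered the subsets with this code: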
Data <- TotalData[TotalData$Starting.Date>= "01.01.2017" & TotalData$Starting.Date <= "31.01.2017",]
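Answer 0 (score: 0)
I think this code is enough to do the job. The logic is: if the starting date and the resignation date fall in the same month, you get 1; otherwise you get the number of whole months between them, i.e. how many months the employee worked at the company.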
library(lubridate)
Data$Starting.Date <- dmy(Data$Starting.Date)
Data$Resignation.Date <- dmy(Data$Resignation.Date)
Data$code <- ifelse(month(Data$Starting.Date) == month(Data$Resignation.Date), 1, (interval(Data$Starting.Date, Data$Resignation.Date) %/% months(1)))
Data:
Data <- structure(list(Person = structure(1:4, .Label = c("John", "johnyy",
"Sarah", "Susi"), class = "factor"), Address = structure(c(1L,
1L, 3L, 2L), .Label = c("abc", "bfg", "cde"), class = "factor"),
Starting.Date = structure(c(17167, 17199, 17172, 17326), class = "Date"),
Resignation.Date = structure(c(17169, 17199, 17353, 17417
), class = "Date"), Job = structure(c(1L, 1L, 3L, 2L), .Label = c("IT",
"secretary", "Teacher"), class = "factor"), code = c(1, 2,
999, 999)), row.names = c(NA, -4L), class = "data.frame")
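The conversion to `Date` in this answer matters because comparing `dd.mm.yyyy` strings with `>=` and `<=` is lexicographic, not chronological. The following is a minimal base R sketch of that difference, using Sarah's resignation date from the question; the variable names are illustrative only, and base `as.Date()` is used here in place of `dmy()` so the snippet has no package dependency.

```r
res <- "06.07.2017"  # Sarah's resignation date from the question (6 July 2017)

# String comparison: characters are compared left to right, so the
# day part ("06") decides the result before the month is ever looked at.
as_text <- res >= "01.01.2017" & res <= "31.01.2017"

# Date comparison: parse with an explicit format first, then compare.
fmt <- "%d.%m.%Y"
res_date <- as.Date(res, format = fmt)
as_date <- res_date >= as.Date("01.01.2017", format = fmt) &
  res_date <= as.Date("31.01.2017", format = fmt)

print(as_text)  # TRUE: the string test wrongly places 6 July in January
print(as_date)  # FALSE: the Date test gives the chronological answer
```

Parsing once, right after reading the data, keeps every later comparison chronological.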
Answer 1 (score: 0)
You can use the lubridate package to get the length of time a person stayed at the company.
library(lubridate)
Data <- "Person Address Starting.Date Resignation.Date Job
John abc 01.01.2017 03.01.2017 IT
Sarah cde 06.01.2017 06.07.2017 Teacher
Susi bfg 09.06.2017 08.09.2017 secretary"
Data <- read.table(text=Data, header = TRUE)
Data$Starting.Date = dmy(Data$Starting.Date)
Data$Resignation.Date = dmy(Data$Resignation.Date)
time.interval <- Data$Starting.Date %--% Data$Resignation.Date
time.period <- as.period(time.interval)
time.period <- month(time.period)
Data$Span <- time.period
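Note that the code above counts elapsed whole months (Sarah gets 6), whereas the nested `ifelse` in the question numbers calendar months starting from the month of hire (Sarah would get 7). If the question's numbering is what is wanted, one possible approach is to compute the calendar-month offset between the two dates, which works for any starting month without per-month subsets. This is only a sketch in base R under that assumption; `month_index` is a hypothetical helper name, not part of any package.

```r
# Sample data from the question (dates stored as dd.mm.yyyy strings).
Data <- data.frame(
  Person = c("John", "Sarah", "Susi"),
  Starting.Date = c("01.01.2017", "06.01.2017", "09.06.2017"),
  Resignation.Date = c("03.01.2017", "06.07.2017", "08.09.2017"),
  stringsAsFactors = FALSE
)

# Parse the strings into Date objects.
start <- as.Date(Data$Starting.Date, format = "%d.%m.%Y")
end <- as.Date(Data$Resignation.Date, format = "%d.%m.%Y")

# Hypothetical helper: running count of calendar months,
# so that consecutive months differ by exactly 1 across year boundaries.
month_index <- function(d) {
  lt <- as.POSIXlt(d)
  (lt$year + 1900) * 12 + lt$mon
}

# 1 = resigned in the month of hire, 2 = the following month, and so on.
Data$Span <- month_index(end) - month_index(start) + 1

print(Data$Span)  # 1 7 4
```

To reproduce the question's fallback code, values outside 1 to 12 could afterwards be recoded, for example with `ifelse(Data$Span >= 1 & Data$Span <= 12, Data$Span, 999)`.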