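I have a data.frame whose variables are of type list, and the values inside each list are of class Date. How can I compute the difference between two dates taken from two different variables, and store the result in a new variable named YrsEmployed that is also of type list?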
Note that StartHireDate and EndHireDate below are of class Date. I just don't know how to make them display as Date when printed:
> print(HiringDateInfo)
X_id StartHireDate
1 530eed6dbfb5c1a8e77cb0fc NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
2 5391a88bbfb5c1b1fed0bcf4 NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
3 53a0fa3cf1f17922a0287add NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
4 53abd15cf1f179c3e81a3fbe NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
5 54dd934ff1f179acfb7b0a2f 14304, 15095, 15279, 15431, 15492, 15645, 15859, NA, 16222, 16375
EndHireDate
1 NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
2 NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
3 NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
4 NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
5 15063, 15308, 15338, 15490, 15613, 15855, 16116, 16159, 16312, NA
I would like a new data.frame with a YrsEmployed column like this:
YrsEmployed
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
2.07945205479452, 0.583561643835616, 0.161643835616438, 0.161643835616438, 0.331506849315069, 0.575342465753425, 0.704109589041096, NA, 0.246575342465753, NA
Answer 0 (score: 0)
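I would suggest not building a new data frame for a single variable such as YrsEmployed. After loading the dplyr package, you can add YrsEmployed as a new column with mutate. First, convert the columns to dates with the following code: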
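# Replace the format string with the one that matches your data, e.g. "%Y-%m-%d".
# If the column is a list, as.Date has to be applied to each element (e.g. with lapply).
HiringDateInfo$StartHireDate <- as.Date(HiringDateInfo$StartHireDate, format = "depending on your format")
HiringDateInfo$EndHireDate <- as.Date(HiringDateInfo$EndHireDate, format = "depending on your format")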
After that, you can use dplyr's mutate function to compute YrsEmployed. Hope it works!
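For list columns, the conversion and the mutate step might look roughly like the sketch below. It is not the answerer's exact code: the two-row table `hires` is a hypothetical sample, and it assumes the numbers in the printout are day counts since 1970-01-01 (R's internal Date representation). Years are approximated as days / 365, matching the expected output in the question.

```r
library(dplyr)

# Hypothetical sample: list columns holding day counts (assumed origin 1970-01-01).
hires <- tibble(
  X_id = c("a", "b"),
  StartHireDate = list(c(NA_real_, NA_real_), c(14304, 15095, NA)),
  EndHireDate   = list(c(NA_real_, NA_real_), c(15063, 15308, 16159))
)

hires <- hires %>%
  mutate(
    # Convert every list element to a Date vector.
    StartHireDate = lapply(StartHireDate, as.Date, origin = "1970-01-01"),
    EndHireDate   = lapply(EndHireDate, as.Date, origin = "1970-01-01"),
    # Element-wise difference in days, divided by 365; the result is again a list column.
    YrsEmployed = Map(function(s, e) as.numeric(e - s) / 365,
                      StartHireDate, EndHireDate)
  )

print(hires$YrsEmployed[[2]])
```

Because mutate evaluates its arguments in order, YrsEmployed already sees the converted Date columns.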
Answer 1 (score: 0)
Here is what I did to solve this problem.
1. I define a function: fdiff <- function(x, y) list(((x-y)/365)*1)
2. Then I use mapply to produce the desired new variable: mapply(fdiff, HiringDateInfo$EndHireDate, HiringDateInfo$StartHireDate)
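The two steps above can be sketched end to end in base R as follows. This is a variant rather than the exact code from the answer: it uses Map instead of mapply so the result is always a list, converts the difference to a plain number of days before dividing by 365, and runs on a small hypothetical sample (the helper `to_date` and the origin 1970-01-01 are assumptions).

```r
# Hypothetical helper: turn day counts into Date values (assumed origin 1970-01-01).
to_date <- function(days) as.Date(days, origin = "1970-01-01")

# Hypothetical two-row sample with Date vectors stored in list columns.
HiringDateInfo <- data.frame(X_id = c("a", "b"), stringsAsFactors = FALSE)
HiringDateInfo$StartHireDate <- list(
  to_date(c(NA_real_, NA_real_)),
  to_date(c(14304, 15095, NA))
)
HiringDateInfo$EndHireDate <- list(
  to_date(c(NA_real_, NA_real_)),
  to_date(c(15063, 15308, 16159))
)

# Difference in days as a plain numeric vector, expressed in years of 365 days.
fdiff <- function(end, start) {
  as.numeric(difftime(end, start, units = "days")) / 365
}

# Map applies fdiff to each pair of list elements and returns a list.
HiringDateInfo$YrsEmployed <- Map(fdiff,
                                  HiringDateInfo$EndHireDate,
                                  HiringDateInfo$StartHireDate)

print(HiringDateInfo$YrsEmployed)
```

Dividing by 365 ignores leap years, which matches the expected values in the question; a calendar-exact figure would need a different calculation.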