How to calculate hours per day of the week in a data.table in R

Time: 2017-05-28 12:10:42

Tags: r dataframe data.table

I have a data.table days_dt

days_dt <- data.table(day = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"))

which looks like

days_dt
         day
1:    Monday
2:   Tuesday
3: Wednesday
4:  Thursday
5:    Friday
6:  Saturday
7:    Sunday

I have another data.table with a single record, which holds a from time and a to time for each day:

 > weighted_average_time
  mon_from_time mon_to_time tue_from_time tue_to_time wed_from_time wed_to_time thu_from_time
 1      7.965174    21.39378      7.965174    21.39378      7.965174    21.39378      7.965174
  thu_to_time fri_from_time fri_to_time sat_from_time sat_to_time sun_from_time sun_to_time
 1    21.39876      7.965174    21.39876      7.942786    21.35149      9.766915    16.91617

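I want to find the difference between the to time and the from time for each day, and store it in a new column of the first table, days_dt. Example for Monday: 21.39378 - 7.965174 = 13.42861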

How can I do this using data.table in R?

The expected output should look like

days_dt
day     time_diff
Monday  13.42861
.       .
.       .
and so on for all the days

1 answer:

Answer 0: (score: 1)

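We melt the second dataset to 'long' format, group by the leading substring of 'variable' (i.e. only 'mon', 'tue', etc.), and take the difference of the 'value' column. We then join the result to the original dataset using on, after creating a matching grouping column there with substr:
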
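# grouping key: first three letters of the day name, lower-cased ("mon", "tue", ...)
days_dt[, grp := tolower(substr(day, 1, 3))][]
# melt to long format, take to - from per day prefix, then update-join on 'grp'
days_dt[ melt(setDT(weighted_average_time))[,  diff(value) , 
     .(grp = sub("_.*", "", variable))], time_diff := V1, on = 'grp']
# drop the helper column
days_dt[, grp := NULL][]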
#        day time_diff
#1:    Monday 13.428606
#2:   Tuesday 13.428606
#3: Wednesday 13.428606
#4:  Thursday 13.433586
#5:    Friday 13.433586
#6:  Saturday 13.408704
#7:    Sunday  7.149255
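
For readers who want to run this end to end, here is a minimal self-contained sketch of the same melt-and-join idea. It rebuilds both tables from the values shown in the question and, instead of relying on diff() and the column order, splits each column name into a day prefix and a from/to label so that the subtraction is explicit. The intermediate names long, wide and kind are illustrative and do not come from the original answer; this is one possible way to write it, not the only one.

```r
library(data.table)

# Input tables, rebuilt from the values shown in the question
days_dt <- data.table(day = c("Monday", "Tuesday", "Wednesday", "Thursday",
                              "Friday", "Saturday", "Sunday"))

weighted_average_time <- data.table(
  mon_from_time = 7.965174, mon_to_time = 21.39378,
  tue_from_time = 7.965174, tue_to_time = 21.39378,
  wed_from_time = 7.965174, wed_to_time = 21.39378,
  thu_from_time = 7.965174, thu_to_time = 21.39876,
  fri_from_time = 7.965174, fri_to_time = 21.39876,
  sat_from_time = 7.942786, sat_to_time = 21.35149,
  sun_from_time = 9.766915, sun_to_time = 16.91617
)

# Long format: one row per original column, with its name in 'variable'
long <- melt(weighted_average_time,
             measure.vars = names(weighted_average_time),
             variable.factor = FALSE)

# Split "mon_from_time" into the day prefix ("mon") and the kind ("from" / "to")
long[, c("grp", "kind") := tstrsplit(variable, "_", fixed = TRUE, keep = 1:2)]

# One row per day prefix with 'from' and 'to' columns, then the difference
wide <- dcast(long, grp ~ kind, value.var = "value")
wide[, time_diff := to - from]

# Build the same key in days_dt, update-join, and drop the helper column
days_dt[, grp := tolower(substr(day, 1, 3))]
days_dt[wide, time_diff := i.time_diff, on = "grp"]
days_dt[, grp := NULL]

print(days_dt)
```

Naming from and to explicitly means the result does not depend on the from column appearing before the to column in weighted_average_time, which the diff(value) version assumes.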