Suppose I have this data:
C=
year month day hour minute rain
2010 01 01 00 00 0.000
2011 01 01 00 00 0.000
2012 01 01 00 00 0.000
2013 01 01 00 00 0.000
2014 01 01 00 00 0.000
2015 01 01 00 15 0.000
and reference data such as:
R=
year month day hour minute rain
2013 01 01 00 00 0.000
2013 01 01 00 05 0.000
2013 01 01 00 10 0.000
2013 01 01 00 15 0.000
2013 01 01 00 20 0.000
2014 01 01 00 00 0.000
2014 01 01 00 05 0.000
2014 01 01 00 10 0.000
2014 01 01 00 15 0.000
2014 01 01 00 20 0.000
2015 01 01 00 00 0.000
2015 01 01 00 05 0.000
2015 01 01 00 10 0.000
2015 01 01 00 15 0.000
2015 01 01 00 20 0.000
I need to complete this M:
M=
year month day hour minute rain
2013 01 01 00 00 0.000
2013 01 01 00 05 0.000
2013 01 01 00 10 0.000
2013 01 01 00 15 0.000
2013 01 01 00 20 0.000
2014 01 01 00 00 0.000
2014 01 01 00 05 0.000
2014 01 01 00 10 0.000
2014 01 01 00 15 0.000
2014 01 01 00 20 0.000
2015 01 01 00 15 0.000
2015 01 01 00 20 0.000
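As you can see, M for 2015 starts at "2015 01 01 00 15 0.000". We could use the information in C to loop over nrow(C) and look up each start date. The idea is to complete this data frame with a loop, matching against the reference data R to fill in year, month, day, hour and minute, and to fill the missing values in the rain column with "NaN". The final output is: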
F=
year month day hour minute rain
2013 01 01 00 00 0.000
2013 01 01 00 05 0.000
2013 01 01 00 10 0.000
2013 01 01 00 15 0.000
2013 01 01 00 20 0.000
2014 01 01 00 00 0.000
2014 01 01 00 05 0.000
2014 01 01 00 10 0.000
2014 01 01 00 15 0.000
2014 01 01 00 20 0.000
2015 01 01 00 00 NaN
2015 01 01 00 05 NaN
2015 01 01 00 10 NaN
2015 01 01 00 15 0.000
2015 01 01 00 20 0.000
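Answer 0 (score: 3)
To fill in the rows that are missing from M using the reference data R, a right join can be used, as implemented in data.table. So, no loop is required.
library(data.table)
setDT(M)[setDT(R)[, -"rain"], on = .(year, month, day, hour, minute)]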
year month day hour minute rain
1: 2013 1 1 0 0 0
2: 2013 1 1 0 5 0
3: 2013 1 1 0 10 0
4: 2013 1 1 0 15 0
5: 2013 1 1 0 20 0
6: 2014 1 1 0 0 0
7: 2014 1 1 0 5 0
8: 2014 1 1 0 10 0
9: 2014 1 1 0 15 0
10: 2014 1 1 0 20 0
11: 2015 1 1 0 0 NA
12: 2015 1 1 0 5 NA
13: 2015 1 1 0 10 NA
14: 2015 1 1 0 15 0
15: 2015 1 1 0 20 0
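The question asks for NaN in the unmatched rows, while the join above returns NA. As a minimal sketch, assuming the distinction matters downstream, the NA values can be overwritten after the join. The two small tables below are cut-down stand-ins for M and R (2015 only), and `res` is a name introduced here for illustration.

```r
library(data.table)

# Cut-down stand-ins for the M and R tables from the question (2015 only)
M <- data.table(year = 2015L, month = 1L, day = 1L, hour = 0L,
                minute = c(15L, 20L), rain = c(0, 0))
R <- data.table(year = 2015L, month = 1L, day = 1L, hour = 0L,
                minute = c(0L, 5L, 10L, 15L, 20L), rain = 0)

# Right join: every timestamp in R is kept, rain is taken from M
res <- M[R[, -"rain"], on = .(year, month, day, hour, minute)]

# Unmatched rows carry NA after the join; replace them with NaN by reference
res[is.na(rain), rain := NaN]
print(res)
```

Note that `is.na()` is also TRUE for NaN in R, so code that only tests `is.na(rain)` behaves the same with either value; use `is.nan()` where the two must be told apart.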
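The OP has asked here and here how to control the years. Because the code above is a right join, all rows of R appear in the result set. So, R needs to be filtered appropriately. This can be done by specifying one year explicitly:
setDT(M)[setDT(R)[year == 2014L, -"rain"], on = .(year, month, day, hour, minute)]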
year month day hour minute rain
1: 2014 1 1 0 0 0
2: 2014 1 1 0 5 0
3: 2014 1 1 0 10 0
4: 2014 1 1 0 15 0
5: 2014 1 1 0 20 0
or a range of years:
setDT(M)[setDT(R)[year %in% 2014:2018, -"rain"], on = .(year, month, day, hour, minute)]
year month day hour minute rain
1: 2014 1 1 0 0 0
2: 2014 1 1 0 5 0
3: 2014 1 1 0 10 0
4: 2014 1 1 0 15 0
5: 2014 1 1 0 20 0
6: 2015 1 1 0 0 NA
7: 2015 1 1 0 5 NA
8: 2015 1 1 0 10 NA
9: 2015 1 1 0 15 0
10: 2015 1 1 0 20 0
or by looking up the years that are present in M:
M[, unique(year)]
[1] 2013 2014 2015
setDT(M)[setDT(R)[year %in% M[, unique(year)], -"rain"], on = .(year, month, day, hour, minute)]
year month day hour minute rain
1: 2013 1 1 0 0 0
2: 2013 1 1 0 5 0
3: 2013 1 1 0 10 0
4: 2013 1 1 0 15 0
5: 2013 1 1 0 20 0
6: 2014 1 1 0 0 0
7: 2014 1 1 0 5 0
8: 2014 1 1 0 10 0
9: 2014 1 1 0 15 0
10: 2014 1 1 0 20 0
11: 2015 1 1 0 0 NA
12: 2015 1 1 0 5 NA
13: 2015 1 1 0 10 NA
14: 2015 1 1 0 15 0
15: 2015 1 1 0 20 0
Data:
R <- structure(list(
  year = c(2013L, 2013L, 2013L, 2013L, 2013L, 2014L, 2014L, 2014L, 2014L, 2014L,
           2015L, 2015L, 2015L, 2015L, 2015L),
  month = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  day = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  hour = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  minute = c(0L, 5L, 10L, 15L, 20L, 0L, 5L, 10L, 15L, 20L, 0L, 5L, 10L, 15L, 20L),
  rain = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)),
  .Names = c("year", "month", "day", "hour", "minute", "rain"),
  row.names = c(NA, -15L), class = "data.frame")
M <- structure(list(
  year = c(2013L, 2013L, 2013L, 2013L, 2013L, 2014L, 2014L, 2014L, 2014L, 2014L,
           2015L, 2015L),
  month = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  day = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
  hour = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  minute = c(0L, 5L, 10L, 15L, 20L, 0L, 5L, 10L, 15L, 20L, 15L, 20L),
  rain = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)),
  .Names = c("year", "month", "day", "hour", "minute", "rain"),
  row.names = c(NA, -12L), class = "data.frame")
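If data.table is not available, roughly the same right join can be sketched in base R with merge(). This is an alternative to the data.table approach, not part of it; the tables below are cut-down stand-ins for M and R (2015 only), and `keys` and `res` are names introduced here for illustration.

```r
# Cut-down stand-ins for M and R (2015 only), as plain data frames
M <- data.frame(year = 2015L, month = 1L, day = 1L, hour = 0L,
                minute = c(15L, 20L), rain = c(0, 0))
R <- data.frame(year = 2015L, month = 1L, day = 1L, hour = 0L,
                minute = c(0L, 5L, 10L, 15L, 20L), rain = 0)

# Columns that identify a timestamp
keys <- c("year", "month", "day", "hour", "minute")

# Keep only the key columns of R, then left-join M onto them.
# all.x = TRUE keeps every reference timestamp; unmatched rain becomes NA.
res <- merge(R[keys], M, by = keys, all.x = TRUE)
print(res)
```

Dropping R's own rain column before merging avoids getting two rain columns (rain.x and rain.y) in the result.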