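How can I get the difference between the last price and the first price recorded in each hour? A dplyr solution would be nice.
See the dput of the dataset below: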
structure(list(DATETIME = structure(1:20, .Label = c("2007-05-30 09:41:00",
"2007-05-30 09:45:00", "2007-05-30 10:22:00", "2007-05-30 10:37:00",
"2007-05-30 10:39:00", "2007-05-30 11:25:00", "2007-05-30 13:21:00",
"2007-05-30 14:01:00", "2007-05-31 09:38:00", "2007-05-31 09:56:00",
"2007-05-31 11:02:00", "2007-05-31 11:09:00", "2007-05-31 11:56:00",
"2007-05-31 11:57:00", "2007-05-31 13:42:00", "2007-05-31 14:12:00",
"2007-05-31 14:25:00", "2007-05-31 15:39:00", "2007-05-31 15:48:00",
"2007-05-31 15:55:00"), class = "factor"), MINUTE = c(41L, 45L,
22L, 37L, 39L, 25L, 21L, 1L, 38L, 56L, 2L, 9L, 56L, 57L, 42L,
12L, 25L, 39L, 48L, 55L), HOUR = c(9L, 9L, 10L, 10L, 10L, 11L,
13L, 14L, 9L, 9L, 11L, 11L, 11L, 11L, 13L, 14L, 14L, 15L, 15L,
15L), DAY = c(30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 31L, 31L,
31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L), MONTH = c(5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L), YEAR = c(2007L, 2007L, 2007L, 2007L, 2007L, 2007L,
2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L,
2007L, 2007L, 2007L, 2007L, 2007L), AV.PRICE.BIL = c(45.79, 45.75,
45.79, 45.79, 45.79, 45.79, 45.8, 45.8, 45.79, 45.8, 45.8, 45.8,
45.8, 45.8, 45.8, 45.8, 45.8, 45.8, 45.8, 45.8)), class = "data.frame", row.names = c(NA,
-20L), .Names = c("DATETIME", "MINUTE", "HOUR", "DAY", "MONTH",
"YEAR", "AV.PRICE.BIL"))
Desired sample output:
DATETIME MINUTE HOUR DAY MONTH YEAR AV.PRICE.BIL HOURLY.DIFF
2007-05-30 09:41:00 41 9 30 5 2007 45.79 0
2007-05-30 10:22:00 22 10 30 5 2007 45.79 0
2007-05-30 11:25:00 25 11 30 5 2007 45.79 0
2007-05-30 13:21:00 21 13 30 5 2007 45.79 0
So if some times are missing, the calculation should just use the observations actually recorded for that hour.
Answer 0 (score: 2)
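The first() and last() functions make this fairly simple. I use mutate() and slice() rather than summarise(), because you seem to want to keep the first instance of DATETIME, MINUTE, etc.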
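library(dplyr)

# df is the data frame built from the dput() in the question
df %>%
  group_by(YEAR, MONTH, DAY, HOUR) %>%   # one group per calendar hour
  arrange(MINUTE) %>%                    # rows in each hour end up in minute order
  mutate(HOURLY.DIFF = last(AV.PRICE.BIL) - first(AV.PRICE.BIL)) %>%
  slice(1)                               # keep only the first row of each hour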
Source: local data frame [10 x 8]
Groups: YEAR, MONTH, DAY, HOUR [10]

              DATETIME MINUTE  HOUR   DAY MONTH  YEAR AV.PRICE.BIL HOURLY.DIFF
                <fctr>  <int> <int> <int> <int> <int>        <dbl>       <dbl>
1  2007-05-30 09:41:00     41     9    30     5  2007        45.79       -0.04
2  2007-05-30 10:22:00     22    10    30     5  2007        45.79        0.00
3  2007-05-30 11:25:00     25    11    30     5  2007        45.79        0.00
4  2007-05-30 13:21:00     21    13    30     5  2007        45.80        0.00
5  2007-05-30 14:01:00      1    14    30     5  2007        45.80        0.00
6  2007-05-31 09:38:00     38     9    31     5  2007        45.79        0.01
7  2007-05-31 11:02:00      2    11    31     5  2007        45.80        0.00
8  2007-05-31 13:42:00     42    13    31     5  2007        45.80        0.00
9  2007-05-31 14:12:00     12    14    31     5  2007        45.80        0.00
10 2007-05-31 15:39:00     39    15    31     5  2007        45.80        0.00
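For contrast with the mutate() and slice() approach, here is a minimal sketch of the summarise() variant, which returns one row per hour but drops DATETIME and MINUTE. The data frame df_small is a hypothetical miniature built from the first five rows of the question's data, not the full dataset, and the name hourly is likewise only illustrative.

```r
library(dplyr)

# Hypothetical miniature of the question's data: the first five rows only.
df_small <- data.frame(
  YEAR = 2007L, MONTH = 5L, DAY = 30L,
  HOUR = c(9L, 9L, 10L, 10L, 10L),
  MINUTE = c(41L, 45L, 22L, 37L, 39L),
  AV.PRICE.BIL = c(45.79, 45.75, 45.79, 45.79, 45.79)
)

hourly <- df_small %>%
  # Sort chronologically so first()/last() pick the earliest and latest price.
  arrange(YEAR, MONTH, DAY, HOUR, MINUTE) %>%
  group_by(YEAR, MONTH, DAY, HOUR) %>%
  # Collapse each hour to a single row holding only the grouping keys and the difference.
  summarise(HOURLY.DIFF = last(AV.PRICE.BIL) - first(AV.PRICE.BIL)) %>%
  ungroup()

print(hourly)
```

If the per-row columns are not needed, this form is shorter; if they are, the mutate() and slice() version above keeps them.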