我有以下提到的数据框:
ID Date Value1 Value2 Value3 Value4
XX-12 2018-02-01 15:48:15 XXC 1000 15.45 18
XX-12 2018-02-05 20:18:43 XTR 1500 15.45 12
XX-13 2018-02-03 19:14:17 XRR 1900 18.25 10
XX-13 2018-02-03 22:42:18 XTC 1600 20.25 12
XX-14 2018-02-04 23:14:45 XRY 1100 10.50 10
XX-15 2018-02-05 21:16:48 XTC 1400 20.25 14
从上面的数据框中,我想基于日期时间以及初始值和最终值之间的差值导出初始值(I_Value
)和最终值(F_Value
)。
必需的输出:
ID I_Value1 F_Value1 I_Value2 F_Value2 Diff2 I_vaule3 F_Value3 Diff3 I_Value4 F_Value4 Diff4
XX-12 XXC XTR 1000 1500 500 15.45 15.45 0 18 12 -6
XX-13 XRR XTC 1900 1600 -300 18.25 20.25 2 10 12 2
XX-14 XRY XTC 1100 1100 0 10.50 10.50 0 10 10 0
XX-15 XTC XTC 1400 1400 0 20.25 20.25 0 14 14 0
答案 0 :(得分:2)
使用dplyr
library(dplyr)
df %>%
mutate(Date = as.POSIXct(Date, "%Y-%m-%d %H:%M:%S", tz = "GMT")) %>%
arrange(ID, Date) %>%
group_by(ID) %>%
summarise_at(vars(Value1:Value4), funs(I = first(.),
F = last(.),
Diff = ifelse(is.character(.), NA, last(.) - first(.)))) %>%
select_if(~!all(is.na(.)))
给予
ID Value1_I Value2_I Value3_I Value4_I Value1_F Value2_F Value3_F Value4_F Value2_Diff Value3_Diff Value4_Diff
<chr> <chr> <int> <dbl> <int> <chr> <int> <dbl> <int> <int> <dbl> <int>
1 XX-12 XXC 1000 15.4 18 XTR 1500 15.4 12 500 0 -6
2 XX-13 XRR 1900 18.2 10 XTC 1600 20.2 12 -300 2.00 2
3 XX-14 XRY 1100 10.5 10 XRY 1100 10.5 10 0 0 0
4 XX-15 XTC 1400 20.2 14 XTC 1400 20.2 14 0 0 0
示例数据
df <- structure(list(ID = c("XX-12", "XX-12", "XX-13", "XX-13", "XX-14",
"XX-15"), Date = c("2018-02-01 15:48:15", "2018-02-05 20:18:43",
"2018-02-03 19:14:17", "2018-02-03 22:42:18", "2018-02-04 23:14:45",
"2018-02-05 21:16:48"), Value1 = c("XXC", "XTR", "XRR", "XTC",
"XRY", "XTC"), Value2 = c(1000L, 1500L, 1900L, 1600L, 1100L,
1400L), Value3 = c(15.45, 15.45, 18.25, 20.25, 10.5, 20.25),
Value4 = c(18L, 12L, 10L, 12L, 10L, 14L)), .Names = c("ID",
"Date", "Value1", "Value2", "Value3", "Value4"), class = "data.frame", row.names = c(NA,
-6L))