My input table is as follows:
+------+------------------+
| Name | Datetime         |
+------+------------------+
| ABC  | 26-01-2019 4:55  |
| ABC  | 26-01-2019 4:35  |
| ABC  | 26-01-2019 5:00  |
| XYZ  | 26-01-2019 2:50  |
| XYZ  | 26-01-2019 4:00  |
| XYZ  | 26-01-2019 4:59  |
+------+------------------+
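From the table above, I want to find the minimum and maximum of "Datetime" for each "Name", drop the rows whose "Datetime" falls in between, and automatically create another column saying whether the person was admitted early or late. I am using RStudio. The expected output is: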
+------+------------------+--------+
| Name | Datetime         | Col3   |
+------+------------------+--------+
| ABC  | 26-01-2019 4:35  | Early  |
| ABC  | 26-01-2019 5:00  | Late   |
| XYZ  | 26-01-2019 2:50  | Early  |
| XYZ  | 26-01-2019 4:59  | Late   |
+------+------------------+--------+
Thanks.
Answer 0: (score: 0)
Using dplyr, one way is to convert the Datetime column to POSIXct, arrange by it, and select the first and last rows (the minimum and maximum) in each group.
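The code that accompanied this answer did not survive, so what follows is only a minimal sketch of the approach described, not the original code. It assumes the data frame is called `df`, that `Datetime` is a character column in day-month-year format, and that every `Name` has at least two rows. The name `result`, the choice of `slice()`, and the `tz = "UTC"` setting are assumptions made for illustration.

```r
library(dplyr)

# Sample data copied from the question (the name `df` is an assumption)
df <- data.frame(
  Name = c("ABC", "ABC", "ABC", "XYZ", "XYZ", "XYZ"),
  Datetime = c("26-01-2019 4:55", "26-01-2019 4:35", "26-01-2019 5:00",
               "26-01-2019 2:50", "26-01-2019 4:00", "26-01-2019 4:59"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # Parse the text into POSIXct so sorting is chronological, not alphabetical
  mutate(Datetime = as.POSIXct(Datetime, format = "%d-%m-%Y %H:%M", tz = "UTC")) %>%
  # Sort within each Name from earliest to latest
  arrange(Name, Datetime) %>%
  group_by(Name) %>%
  # Keep only the first (minimum) and last (maximum) row of each group
  slice(c(1, n())) %>%
  # Label the two remaining rows; assumes each group had at least two rows
  mutate(Col3 = c("Early", "Late")) %>%
  ungroup()

print(result)
```

Here `Datetime` ends up as POSIXct rather than the original text. If the original strings are needed in the output, one option is to sort on a separate parsed column and drop it afterwards.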
Answer 1: (score: 0)
Here is a base R option:
transform(stack(data.frame(
do.call(cbind,
tapply(as.POSIXct(dd$Datetime, format = '%d-%m-%Y %H:%M'), dd$Name, function(i)
as.character(c(min(i), max(i))))), stringsAsFactors = FALSE)),
col3 = c('Early', 'Late'))
# values ind col3
#1 2019-01-26 04:35:00 ABC Early
#2 2019-01-26 05:00:00 ABC Late
#3 2019-01-26 02:50:00 XYZ Early
#4 2019-01-26 04:59:00 XYZ Late
Answer 2: (score: 0)
We can use tidyverse:
library(tidyverse)
library(lubridate)  # provides dmy_hm(); older tidyverse versions do not attach it
df %>%
arrange(dmy_hm(Datetime)) %>%
group_by(Name) %>%
filter(row_number() %in% c(1, n())) %>%
mutate(Col3 = c("Early", "Late"))
# A tibble: 4 x 3
# Groups: Name [2]
# Name Datetime Col3
# <chr> <chr> <chr>
#1 XYZ 26-01-2019 2:50 Early
#2 ABC 26-01-2019 4:35 Early
#3 XYZ 26-01-2019 4:59 Late
#4 ABC 26-01-2019 5:00 Late
df <- structure(list(Name = c("ABC", "ABC", "ABC", "XYZ", "XYZ", "XYZ"
), Datetime = c("26-01-2019 4:55", "26-01-2019 4:35", "26-01-2019 5:00",
"26-01-2019 2:50", "26-01-2019 4:00", "26-01-2019 4:59")),
class = "data.frame", row.names = c(NA,
-6L))