I have a data frame (df) like this:
TIMESTAMP STATUS
2016-01-01 00:00:00 OFF
2016-01-01 01:00:00 ON
2016-01-01 02:00:00 ON
2016-01-01 03:00:00 OFF
2016-01-02 00:00:00 ON
2016-01-02 01:00:00 OFF
...
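I need to aggregate (?) the sequence of statuses for each day. For example, the first day in df gives the sequence OFF-ON-ON-OFF, while the second day gives ON-OFF.
So I need a data frame summarized by day: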
DAY SEQUENCE
2016-01-01 OFF-ON-ON-OFF
2016-01-02 ON-OFF
...
Answer 0: (score: 1)
library(dplyr)
df %>%
  arrange(TIMESTAMP) %>%                 # sort chronologically first
  mutate(date = as.Date(TIMESTAMP)) %>%  # keep only the calendar day
  group_by(date) %>%
  summarise(sequence = paste(STATUS, collapse = "-"))  # join each day's statuses
Data:
df <- data.frame(
  TIMESTAMP = c("2016-01-01 00:00:00", "2016-01-01 01:00:00", "2016-01-01 02:00:00",
                "2016-01-01 03:00:00", "2016-01-02 00:00:00", "2016-01-02 01:00:00"),
  STATUS = c("OFF", "ON", "ON", "OFF", "ON", "OFF")
)
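The pipeline above sorts TIMESTAMP while it is still a character column, which works here only because the strings are in ISO format. A minimal sketch, assuming you would rather parse the timestamps first and get the DAY and SEQUENCE column names from the question (the UTC time zone and the shuffled row order are assumptions for illustration):

```r
library(dplyr)

# Same sample rows as above, deliberately shuffled to show that sorting matters
df <- data.frame(
  TIMESTAMP = c("2016-01-02 01:00:00", "2016-01-01 02:00:00",
                "2016-01-01 00:00:00", "2016-01-02 00:00:00",
                "2016-01-01 03:00:00", "2016-01-01 01:00:00"),
  STATUS = c("OFF", "ON", "OFF", "ON", "OFF", "ON"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # parse to a real date-time so that sorting is chronological, not alphabetical
  mutate(TIMESTAMP = as.POSIXct(TIMESTAMP, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")) %>%
  arrange(TIMESTAMP) %>%
  group_by(DAY = as.Date(TIMESTAMP)) %>%
  summarise(SEQUENCE = paste(STATUS, collapse = "-"), .groups = "drop")

print(result)
```

If your timestamps are recorded in local time, pass the matching tz to both as.POSIXct and as.Date, otherwise events near midnight may land on the wrong day.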
Answer 1: (score: 1)
As is tradition, I'll add a data.table solution here:
library(data.table)
library(lubridate)
s <- "TIMESTAMP, STATUS
2016-01-01 00:00:00, OFF
2016-01-01 01:00:00, ON
2016-01-01 02:00:00, ON
2016-01-01 03:00:00, OFF
2016-01-02 00:00:00, ON
2016-01-02 01:00:00, OFF"
dt <- fread(s)
dt[, day_time := ymd_hms(TIMESTAMP)]
# make sure the events are in chronological order before collapsing
setorder(dt, day_time)
dt[, DAY := date(day_time)]
dt[, paste0(STATUS, collapse = "-"), by = DAY]
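The last line above returns the sequences in a column with the default name V1. A minimal sketch, assuming you want the DAY and SEQUENCE names from the question and would rather not depend on lubridate (the table is built directly instead of via fread, and UTC is an assumed time zone):

```r
library(data.table)

dt <- data.table(
  TIMESTAMP = c("2016-01-01 00:00:00", "2016-01-01 01:00:00",
                "2016-01-01 02:00:00", "2016-01-01 03:00:00",
                "2016-01-02 00:00:00", "2016-01-02 01:00:00"),
  STATUS = c("OFF", "ON", "ON", "OFF", "ON", "OFF")
)

# parse with base R, then sort in place by the parsed date-time
dt[, day_time := as.POSIXct(TIMESTAMP, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")]
setorder(dt, day_time)

# name both the grouping column and the collapsed column explicitly
result <- dt[, .(SEQUENCE = paste(STATUS, collapse = "-")),
             by = .(DAY = as.Date(day_time))]

print(result)
```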
Answer 2: (score: 0)
Judging by your desired result, I assume you also want to drop the time part of the timestamp. If so, you can use aggregate, as.Date and paste from base R.
df <- data.frame(TIMESTAMP =
c('2016-01-01 00:00:00','2016-01-01 01:00:00',
'2016-01-01 02:00:00','2016-01-01 03:00:00',
'2016-01-02 00:00:00','2016-01-02 01:00:00'),
STATUS = c('OFF','ON','ON','OFF','ON','OFF'))
aggregate(df$STATUS, list(as.Date(df$TIMESTAMP)), paste, collapse="-")
## Group.1 x
## 2016-01-01 OFF-ON-ON-OFF
## 2016-01-02 ON-OFF
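Note that the aggregate call above pastes the statuses in whatever order the rows happen to be in. A rough sketch, assuming the rows may arrive unsorted and that you want the DAY and SEQUENCE column names from the question (the shuffled order and the UTC time zone are assumptions for illustration):

```r
# Same sample rows, deliberately out of order
df <- data.frame(
  TIMESTAMP = c("2016-01-02 01:00:00", "2016-01-01 02:00:00",
                "2016-01-01 00:00:00", "2016-01-02 00:00:00",
                "2016-01-01 03:00:00", "2016-01-01 01:00:00"),
  STATUS = c("OFF", "ON", "OFF", "ON", "OFF", "ON"),
  stringsAsFactors = FALSE
)

# sort by the parsed date-time so each day's statuses are pasted in time order
df <- df[order(as.POSIXct(df$TIMESTAMP, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")), ]

# naming the list elements sets the column names of the result
result <- aggregate(list(SEQUENCE = df$STATUS),
                    by = list(DAY = as.Date(df$TIMESTAMP)),
                    FUN = paste, collapse = "-")

print(result)
```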