I have been using the unnest() function from tidyr to expand a column that contains lists of dates.
x <- seq(from= as.POSIXct('2011-01-01 14:00:00'),length.out=100,by = "hour")
y <- seq(from= as.POSIXct('2012-01-01 14:00:00'),length.out=100,by = "hour")
df <- data.frame(x,y)
When I try to create a list for each row and then unnest it, I get the following error.
df %>% rowwise() %>% mutate(sequence = list(seq.POSIXt(x,y,"10 min"))) %>% unnest(sequence)
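Error: Each column must either be a list of vectors or a list of data frames [sequence]
Can anyone help? I have done the same thing with numbers, and unnest() works fine there. It just does not seem to work with lists that contain dates/datetimes.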
Answer 0 (score: 1)
Coerce the result of seq.POSIXt() to a data frame and wrap that data frame in a list:
x <- seq(from= as.POSIXct('2011-01-01 14:00:00'),length.out=100,by = "hour")
y <- seq(from= as.POSIXct('2012-01-01 14:00:00'),length.out=100,by = "hour")
df <- data.frame(x,y)
library(dplyr)
library(tidyr)
df %>%
rowwise() %>%
mutate(sequence = list(data.frame(seq.POSIXt(x, y, "10 min")))) %>%
unnest(sequence)
# # A tibble: 5,256,100 x 3
# x y seq.POSIXt.x..y...10.min..
# <dttm> <dttm> <dttm>
# 1 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:00:00
# 2 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:10:00
# 3 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:20:00
# 4 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:30:00
# 5 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:40:00
# 6 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:50:00
# 7 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 15:00:00
# 8 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 15:10:00
# 9 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 15:20:00
# 10 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 15:30:00
# # ... with 5,256,090 more rows
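The auto-generated column name above (seq.POSIXt.x..y...10.min..) comes from wrapping an unnamed vector in data.frame(). As a minimal sketch, naming the column inside data.frame() should give a cleaner result. The three-row data, the one-hour spans, and the fixed UTC time zone are assumptions chosen only to keep the example small:

```r
library(dplyr)
library(tidyr)

# Small illustrative data (assumed: 3 rows, UTC, each end one hour after its start)
x <- seq(from = as.POSIXct("2011-01-01 14:00:00", tz = "UTC"),
         length.out = 3, by = "hour")
y <- x + 3600  # adding seconds to a POSIXct vector
df <- data.frame(x, y)

res <- df %>%
  rowwise() %>%
  # Name the inner column explicitly so unnest() does not produce
  # a mangled name derived from the seq.POSIXt() call
  mutate(sequence = list(data.frame(sequence = seq(x, y, by = "10 min")))) %>%
  ungroup() %>%
  unnest(sequence)

names(res)
nrow(res)
```

Each row spans one hour in 10-minute steps, so every row expands to 7 timestamps.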
Answer 1 (score: 0)
If I remember correctly, a plain data.frame does not handle list columns well.
Try replacing df <- data.frame(x,y) with df <- tibble::tibble(x, y):
library(dplyr)
library(tidyr)
x <- seq(from= as.POSIXct('2011-01-01 14:00:00'),length.out=100,by = "hour")
y <- seq(from= as.POSIXct('2012-01-01 14:00:00'),length.out=100,by = "hour")
df <- tibble::tibble(x,y)
df %>% rowwise() %>% mutate(sequence = list(seq.POSIXt(x,y,"10 min"))) %>% unnest(sequence)
#> # A tibble: 5,256,100 x 3
#> x y sequence
#> <dttm> <dttm> <dttm>
#> 1 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:00:00
#> 2 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:10:00
#> 3 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:20:00
#> 4 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:30:00
#> 5 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:40:00
#> 6 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 14:50:00
#> 7 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 15:00:00
#> 8 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 15:10:00
#> 9 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 15:20:00
#> 10 2011-01-01 14:00:00 2012-01-01 14:00:00 2011-01-01 15:30:00
#> # ... with 5,256,090 more rows
Answer 2 (score: 0)
I could not reproduce the error, but an alternative approach may help.
library(dplyr)
library(tidyr)
df %>%
rowwise() %>%
mutate(sequence = paste(seq.POSIXt(x, y, "10 min"), collapse=",")) %>%
ungroup() %>%
separate_rows(sequence, sep=",") %>%
mutate(sequence = as.POSIXct(sequence))
Or, if you want to use unnest:
df %>%
rowwise() %>%
mutate(sequence = list(seq.POSIXt(x, y, "10 min"))) %>%
ungroup() %>%
unnest(sequence)
The output is:
x y sequence
<dttm> <dttm> <dttm>
1 2011-01-01 14:00:00 2011-01-02 14:00:00 2011-01-01 14:00:00
2 2011-01-01 14:00:00 2011-01-02 14:00:00 2011-01-01 14:10:00
3 2011-01-01 14:00:00 2011-01-02 14:00:00 2011-01-01 14:20:00
4 2011-01-01 14:00:00 2011-01-02 14:00:00 2011-01-01 14:30:00
5 2011-01-01 14:00:00 2011-01-02 14:00:00 2011-01-01 14:40:00
...
Sample data:
df <- structure(list(x = structure(c(1293870600L, 1293874200L, 1293877800L,
1293881400L, 1293885000L, 1293888600L, 1293892200L, 1293895800L,
1293899400L, 1293903000L), class = c("POSIXct", "POSIXt"), tzone = ""),
y = structure(c(1293957000L, 1293960600L, 1293964200L, 1293967800L,
1293971400L, 1293975000L, 1293978600L, 1293982200L, 1293985800L,
1293989400L), class = c("POSIXct", "POSIXt"), tzone = "")), .Names = c("x",
"y"), row.names = c(NA, -10L), class = "data.frame")
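Note that the sample data stores epoch seconds with tzone = "", so the printed datetimes depend on the local time zone of the R session. The short check below is only a sketch of how to inspect the first x value; the choice of "Asia/Kolkata" is an assumption, made because it reproduces the 14:00:00 shown in the output above:

```r
# First x value of the sample data, interpreted as seconds since the Unix epoch
x_first <- as.POSIXct(1293870600, origin = "1970-01-01", tz = "UTC")

# Same instant rendered in two time zones
utc_str <- format(x_first, "%Y-%m-%d %H:%M:%S")
ist_str <- format(x_first, "%Y-%m-%d %H:%M:%S", tz = "Asia/Kolkata")

utc_str  # "2011-01-01 08:30:00"
ist_str  # "2011-01-01 14:00:00"
```

If your session uses a different time zone, the values printed for x, y and sequence will be shifted accordingly, but the 10-minute spacing is unaffected.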