Imagine this data frame:
df <- tibble(
key = c(rep(1, 3), rep(2, 3), rep(3, 3)),
date = rep(Sys.Date(), 9),
hour = rep(c('00', '01', '02'), 3),
value = rep(c(8, 9, 10), 3)
)
I want output in which the group summary column is a named list of hours and values. For each group, it would be as if I had done this:
as.list(setNames(df$value[df$key == 1], df$hour[df$key == 1]))
$`00`
[1] 8
$`01`
[1] 9
$`02`
[1] 10
Something along these lines, but that actually works:
df %>%
group_by(key, date) %>%
summarise(
daily_value = sum(value),
hourly_values = as.list(setNames(value, hour))
)
A solution using nest or something similar from tidyr would also be fine.
Edit: the output should be identical to what is produced here:
outputDf <- df %>%
group_by(key, date) %>%
summarise(daily_value = sum(value))
outputDf$hourly_value <- list(
as.list(setNames(df$value[df$key == 1], df$hour[df$key == 1])),
as.list(setNames(df$value[df$key == 2], df$hour[df$key == 2])),
as.list(setNames(df$value[df$key == 3], df$hour[df$key == 3]))
)
outputDf
# A tibble: 3 x 4
# Groups: key [?]
key date daily_value hourly_value
<dbl> <date> <dbl> <list>
1 1 2019-06-18 27 <list [3]>
2 2 2019-06-18 27 <list [3]>
3 3 2019-06-18 27 <list [3]>
outputDf$hourly_value
[[1]]
[[1]]$`00`
[1] 8
[[1]]$`01`
[1] 9
[[1]]$`02`
[1] 10
[[2]]
[[2]]$`00`
[1] 8
[[2]]$`01`
[1] 9
[[2]]$`02`
[1] 10
[[3]]
[[3]]$`00`
[1] 8
[[3]]$`01`
[1] 9
[[3]]$`02`
[1] 10
Answer 0 (score: 2)
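We need to wrap the result in list, because summarise expects each group to return a single row. On its own, as.list returns a list whose length equals the number of rows in the group. Wrapping that in another list ensures that the value summarise receives for each group has length 1.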
library(dplyr)
df %>%
group_by(key, date) %>%
summarise(daily_value = sum(value),
hourly_values = list(as.list(setNames(value, hour))))
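The length argument above can be checked directly. The following is a minimal sketch, assuming the question's data with a fixed date (an assumption made here for reproducibility; the original uses Sys.Date()). The names inner and res are illustrative only.

```r
library(dplyr)

# Same shape as the question's data, with a fixed date (assumption)
df <- tibble(
  key = rep(c(1, 2, 3), each = 3),
  date = as.Date("2019-06-18"),
  hour = rep(c("00", "01", "02"), 3),
  value = rep(c(8, 9, 10), 3)
)

# One group's named list, built the way the question does it
inner <- as.list(setNames(df$value[df$key == 1], df$hour[df$key == 1]))
length(inner)        # 3: one element per row in the group
length(list(inner))  # 1: a single list-column cell

res <- df %>%
  group_by(key, date) %>%
  summarise(daily_value = sum(value),
            hourly_values = list(as.list(setNames(value, hour))))

nrow(res)                                 # 3: one row per group
names(res$hourly_values[[1]])             # "00" "01" "02"
identical(res$hourly_values[[1]], inner)  # TRUE
```

Each cell of hourly_values then holds the same named list that the question builds by hand for one key.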
Answer 1 (score: 0)
library(dplyr)
df <- tibble(
key = c(rep(1, 3), rep(2, 3), rep(3, 3)),
date = rep(Sys.Date(), 9),
hour = rep(c('00', '01', '02'), 3),
value = rep(c(8, 9, 10), 3)
)
df2 <- df %>%
group_by(key, date) %>%
mutate(daily_value = sum(value),
hourly_value = as.list(value)) # create a list column with one element per row
names(df2$hourly_value) <- df$hour # name each element of the list column by its hour