I have the following data frame in R:
Name Weekday Block Count
ABC_1 1 5B 12
ABC_1 1 5B 12
ABC_1 1 5C 10
ABC_1 1 5B 10
DER_1 2 5B 10
DER_1 2 5C 10
DER_1 2 5B 10
DER_1 2 5C 10
I want the following data frame as output:
Name Weekday Block 5B 5C Cont
ABC_1 1 5B,5B,5C,5B 34 10 12,12,10,10
DER_1 2 5B,5C,5B,5C 20 20 10,10,10,10
I am using the following code to do this:
df_new<- df %>%
group_by(Weekday,Name) %>%
mutate(yard_blocks = paste0(Block, collapse = ",")) %>%
as.data.frame()
However, it does not give me the desired output.
Answer 0 (score: 2)
After grouping by 'Name', 'Weekday', and 'Block', get the frequency as a column ('n'). Then, grouping by 'Name' and 'Weekday', we mutate to paste the 'Block' elements together into a new column 'Block1', get the unique rows (distinct), and spread from 'long' to 'wide' format.
library(dplyr)
library(tidyr)
df %>%
group_by(Name, Weekday, Block) %>%
mutate(n = n()) %>%
group_by(Name, Weekday) %>%
mutate(Block1 = toString(Block)) %>%
distinct %>%
spread(Block, n) %>%
rename(Block = Block1)
# A tibble: 2 x 5
# Groups: Name, Weekday [2]
# Name Weekday Block `5B` `5C`
#* <chr> <int> <chr> <int> <int>
#1 ABC_1 1 5B, 5B, 5C, 5B 3 1
#2 DER_1 2 5B, 5C, 5B, 5C 2 2
Based on the updated dataset and question:
df %>%
group_by(Name, Weekday) %>%
mutate(Block1 = toString(Block), Cont = toString(Count)) %>%
group_by(Block, add = TRUE) %>%
mutate(Count = sum(Count)) %>%
distinct %>%
spread(Block, Count)
# A tibble: 2 x 6
# Groups: Name, Weekday [2]
# Name Weekday Block1 Cont `5B` `5C`
#* <chr> <int> <chr> <chr> <int> <int>
#1 ABC_1 1 5B, 5B, 5C, 5B 12, 12, 10, 10 34 10
#2 DER_1 2 5B, 5C, 5B, 5C 10, 10, 10, 10 20 20
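In newer releases, `spread()` is superseded by `pivot_wider()` (tidyr 1.0.0 and later) and the `add` argument of `group_by()` is deprecated in favour of `.add` (dplyr 1.0.0 and later). The following is a minimal sketch of the same reshaping in that style, not the method from the answer above. It assumes the example data from the question rebuilt by hand as `df`, and a tidyr version (1.1.0 or later) in which `values_fn` accepts a single function; the name `wide` is only illustrative.

```r
library(dplyr)
library(tidyr)

# Example data from the question, rebuilt by hand (assumed column types)
df <- data.frame(
  Name    = rep(c("ABC_1", "DER_1"), each = 4),
  Weekday = rep(c(1L, 2L), each = 4),
  Block   = c("5B", "5B", "5C", "5B", "5B", "5C", "5B", "5C"),
  Count   = c(12L, 12L, 10L, 10L, 10L, 10L, 10L, 10L),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(Name, Weekday) %>%
  # Collapse all blocks and counts of each Name/Weekday into one string
  mutate(Block1 = toString(Block), Cont = toString(Count)) %>%
  ungroup() %>%
  # One column per Block; duplicate Name/Weekday/Block cells are summed
  pivot_wider(
    id_cols     = c(Name, Weekday, Block1, Cont),
    names_from  = Block,
    values_from = Count,
    values_fn   = sum
  )

print(wide)
```

Letting `values_fn = sum` do the aggregation inside `pivot_wider()` replaces the separate `mutate(Count = sum(Count))` and `distinct` steps, at the cost of requiring a recent tidyr.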