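Is it possible to get a crosstab with multiple values in the rows, and to specify the order of those rows and columns?
The crosstab I need contains frequencies.
If I have a data frame like this:
name abbr itemGroup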
abcd a g1
abcd a g2
bcde b g1
bcde b g2
abcd a g3
abcd a g1
bcde b g2
bcde b g2
bcde b g2
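How can I get a crosstab like the one below? The rows should be sorted by each row's total, in decreasing order, and the columns should be ordered left to right by each column's total, also decreasing.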
name   abbr  g2  g1  g3  total
bcde   b      4   1   0      5
abcd   a      1   2   1      4
TOTAL         5   3   1
Answer 0 (score: 2)
Here is a somewhat involved tidyverse approach.
library(tidyverse) # for dplyr, purrr, tibble, tidyr

df <- tribble(
  ~name, ~abbr, ~itemGroup,
  "abcd", "a", "g1",
  "abcd", "a", "g2",
  "bcde", "b", "g1",
  "bcde", "b", "g2",
  "abcd", "a", "g3",
  "abcd", "a", "g1",
  "bcde", "b", "g2",
  "bcde", "b", "g2",
  "bcde", "b", "g2"
)

# Column order: item groups sorted by overall frequency, largest first
order <- count(df, name, abbr, itemGroup) %>%
  group_by(itemGroup) %>%
  summarize(n = sum(n)) %>%
  arrange(desc(n)) %>%
  pull(itemGroup)

df %>%
  # Frequency of each (name, abbr, itemGroup) combination
  count(name, abbr, itemGroup) %>%
  # One column per item group; missing combinations become NA
  spread(itemGroup, n) %>%
  # Add the per-row totals
  left_join(group_by(df, name, abbr) %>%
              summarize(total = n()),
            by = c("name", "abbr")) %>%
  # Append a TOTAL row holding the column sums of the g* columns
  bind_rows(summarize_at(., vars(contains("g")), sum, na.rm = TRUE) %>%
              mutate(name = "TOTAL")) %>%
  # Replace NA with "" (this turns every column into character)
  map_df(~replace(.x, is.na(.x), "")) %>%
  arrange(desc(total)) %>%
  select(name, abbr, one_of(order), total)
Result:
# A tibble: 3 x 6
  name  abbr  g2    g1    g3    total
  <chr> <chr> <chr> <chr> <chr> <chr>
1 bcde  b     4     1           5
2 abcd  a     1     2     1     4
3 TOTAL       5     3     1
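The last three steps replace NA with "", arrange the rows by the total column, and select the remaining columns in the correct order.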
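As a cross-check on the ordering logic, here is a minimal base R sketch of the same idea. It is not the approach from the answer above: it swaps in `table()` for the tidyverse pipeline and keeps the counts numeric, so the ordering is done on numbers rather than on strings. The names `row_key`, `counts` and `result` are made up for illustration, and the blank cells of the desired output are represented as 0 (for empty combinations) and NA (for the grand total).

```r
df <- data.frame(
  name      = c("abcd", "abcd", "bcde", "bcde", "abcd", "abcd", "bcde", "bcde", "bcde"),
  abbr      = c("a", "a", "b", "b", "a", "a", "b", "b", "b"),
  itemGroup = c("g1", "g2", "g1", "g2", "g3", "g1", "g2", "g2", "g2")
)

# Treat each (name, abbr) pair as a single row key (hypothetical helper name)
row_key <- paste(df$name, df$abbr)

# Frequency table as a plain integer matrix: rows = keys, columns = item groups
counts <- unclass(table(row_key, df$itemGroup))

# Reorder rows and columns by decreasing totals
counts <- counts[order(rowSums(counts), decreasing = TRUE),
                 order(colSums(counts), decreasing = TRUE),
                 drop = FALSE]

# Append the row totals as a column and the column totals as a TOTAL row;
# the grand total cell is left as NA to mirror the blank in the desired output
result <- rbind(
  cbind(counts, total = rowSums(counts)),
  TOTAL = c(colSums(counts), NA)
)

print(result)
```

Because `order()` is stable, rows or columns with equal totals keep their original alphabetical order. If name and abbr are needed as separate columns afterwards, the row names would have to be split again, which this sketch does not do.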