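I have a data frame with an ID column and several columns to summarize. In each of those columns (they are mutually exclusive), I want to count the rows that match "a", "b", or either of the two.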
This is as far as I have got. Am I taking the right approach? I am trying to get something like the following:
> df
# A tibble: 5 x 4
     id col1  col2  col3
  <dbl> <chr> <chr> <chr>
1     1 NA    b     NA
2     2 NA    b     NA
3     3 NA    NA    a
4     4 b     NA    NA
5     5 a     NA    NA
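The code that built `df` did not survive in the post, so here is a minimal sketch that rebuilds it from the printed output above. The values are copied from that printout; the use of `tibble()` is an assumption about how the data was created.

```r
library(tibble)

# Rebuild the example data from the printed tibble (assumed construction).
df <- tibble(
  id   = c(1, 2, 3, 4, 5),
  col1 = c(NA, NA, NA, "b", "a"),
  col2 = c("b", "b", NA, NA, NA),
  col3 = c(NA, NA, "a", NA, NA)
)

df
```

Each value column has a value only where the others are `NA`, which is what "mutually exclusive" means here.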
Answer 0 (score: 1)
You can try:
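library(tidyverse)

df %>%
  gather(key, value, -id) %>%    # long format: one row per (id, column) pair
  group_by(key, value) %>%
  count() %>%                    # rows per column/value combination
  filter(!is.na(value))          # drop the NA combinations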
# A tibble: 4 x 3
# Groups:   key, value [4]
  key   value     n
  <chr> <chr> <int>
1 col1  a         1
2 col1  b         1
3 col2  b         2
4 col3  a         1
If you want the table-shaped result that was edited into the question, you can do the following:
df %>%
  gather(key, value, -id) %>%
  group_by(key, value) %>%
  count() %>%
  filter(!is.na(value)) %>%
  group_by(key) %>%
  mutate(x = sum(n)) %>%         # x: total non-NA rows per column
  spread(value, n, fill = 0)     # one column per value; absent combinations become 0
# A tibble: 3 x 4
# Groups:   key [3]
  key       x     a     b
  <chr> <int> <dbl> <dbl>
1 col1      2     1     1
2 col2      2     0     2
3 col3      1     1     0
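In current tidyr releases `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`. The following is a sketch of the same pipeline written with the newer functions, not part of the original answer. It assumes tidyr 1.1.0 or later (for a scalar `values_fill`) and rebuilds `df` from the printed data so that it runs on its own; `wide_counts` is just an illustrative name.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Example data, copied from the tibble printed in the question.
df <- tibble(
  id   = c(1, 2, 3, 4, 5),
  col1 = c(NA, NA, NA, "b", "a"),
  col2 = c("b", "b", NA, NA, NA),
  col3 = c(NA, NA, "a", NA, NA)
)

wide_counts <- df %>%
  # Long format, dropping NA cells straight away.
  pivot_longer(-id, names_to = "key", values_to = "value",
               values_drop_na = TRUE) %>%
  # Rows per column/value combination.
  count(key, value) %>%
  # x: total non-NA rows per column.
  group_by(key) %>%
  mutate(x = sum(n)) %>%
  ungroup() %>%
  # One column per value; absent combinations become 0.
  pivot_wider(names_from = value, values_from = n, values_fill = 0L)

wide_counts
```

Unlike the `spread(..., fill = 0)` version, the `a` and `b` columns stay integer here because the fill value is `0L`.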
Answer 1 (score: 1)
One tidyverse possibility could be:
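df %>%
  gather(var, letters, -id, na.rm = TRUE) %>%        # long format, NA cells dropped
  add_count(var, letters, name = "n_letters") %>%    # rows per column/letter
  add_count(var, name = "n_all") %>%                 # non-NA rows per column
  select(-id) %>%
  distinct()                                         # one row per column/letter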
  var   letters n_letters n_all
  <chr> <chr>       <int> <int>
1 col1  b               1     2
2 col1  a               1     2
3 col2  b               2     2
4 col3  a               1     1
Or:
df %>%
  gather(var, letters, -id, na.rm = TRUE) %>%
  add_count(var, letters, name = "n_letters") %>%
  add_count(var, name = "all") %>%
  select(-id) %>%
  distinct() %>%
  spread(letters, n_letters, fill = 0)    # one column per letter; absent combinations become 0
  var     all     a     b
  <chr> <int> <dbl> <dbl>
1 col1      2     1     1
2 col2      2     0     2
3 col3      1     1     0
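As a cross-check on the tables above, the same counts can be computed column by column without reshaping. This base R sketch is an alternative technique, not taken from either answer; it hard-codes the column names and the letters "a" and "b" from the example, and `counts` is an illustrative name.

```r
library(tibble)

# Example data, copied from the tibble printed in the question.
df <- tibble(
  id   = c(1, 2, 3, 4, 5),
  col1 = c(NA, NA, NA, "b", "a"),
  col2 = c("b", "b", NA, NA, NA),
  col3 = c(NA, NA, "a", NA, NA)
)

# For each value column, count rows matching "a", "b", or either.
# %in% treats NA as a non-match, so no explicit NA handling is needed.
counts <- sapply(df[c("col1", "col2", "col3")], function(col) {
  c(all = sum(col %in% c("a", "b")),
    a   = sum(col %in% "a"),
    b   = sum(col %in% "b"))
})

# Transpose so that each row is a column of df, as in the answers above.
t(counts)
```

This avoids the long/wide round trip, at the cost of listing the values to count explicitly.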