Question

I have the following data:

df <- read.table(text =
    "   id    country
    1   IT
    1   IT
    1   USA
    2   USA
    2   FR
    2   IT
    3   USA
    3   USA
    3   IT
    3   FR", header = TRUE)

I need to find the frequency of each country within each ID. So the desired output is:

       id  IT  USA  FR
        1   2   1   0
        2   1   1   1
        3   1   2   1

I know how to count the number of rows per ID with count(), but I don't know how to break the counts down by country. Thanks for your help!

Answer 1

Using dplyr, together with spread() from tidyr:

library(dplyr)
library(tidyr) # provides spread()
df %>% 
  group_by(id) %>%
  count(country) %>% # count the rows for each country within each id
  spread(country, n) # spread the counts into wide format, one column per country

# A tibble: 3 x 4
# Groups:   id [3]
     id    FR    IT   USA
  <int> <int> <int> <int>
1     1    NA     2     1
2     2     1     1     1
3     3     1     1     2

If you want to replace the NAs with 0, do the following:

df %>% 
  group_by(id) %>%
  count(country) %>% 
  spread(country, n) %>% 
  mutate(across(everything(), ~ replace(., is.na(.), 0))) # replace NA with 0 in every non-grouping column; across() (dplyr >= 1.0.0) stands in for the deprecated mutate_each(funs(...))
# A tibble: 3 x 4
# Groups:   id [3]
     id    FR    IT   USA
  <int> <dbl> <dbl> <dbl>
1     1     0     2     1
2     2     1     1     1
3     3     1     1     2
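In recent tidyr releases spread() is superseded by pivot_wider(), whose values_fill argument can fill the missing combinations directly, so the separate NA replacement step is not needed. The following is a minimal sketch of that variant, not the answer author's code. It assumes tidyr 1.1.0 or later and the df from the question; the name wide is chosen here purely for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- read.table(text = "id country
1 IT
1 IT
1 USA
2 USA
2 FR
2 IT
3 USA
3 USA
3 IT
3 FR", header = TRUE)

wide <- df %>%
  count(id, country) %>%             # one row per (id, country) pair with its count n
  pivot_wider(names_from = country,  # one column per country
              values_from = n,       # cell values are the counts
              values_fill = 0)       # pairs that never occur get 0 instead of NA

print(wide)
```

Because count(id, country) takes both variables at once, no explicit group_by() is needed, and the result is an ungrouped tibble.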

Answer 2

This can also be done with base R's xtabs, which cross-tabulates id against country directly.
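A minimal sketch of a cross-tabulation with xtabs, assuming the df from the question. The formula ~ id + country and the names tab and wide are assumptions made for this illustration, not code recovered from the answer.

```r
# Same data as in the question
df <- read.table(text = "id country
1 IT
1 IT
1 USA
2 USA
2 FR
2 IT
3 USA
3 USA
3 IT
3 FR", header = TRUE)

# Contingency table: rows are ids, columns are countries, cells are counts.
# Combinations that never occur are counted as 0, so no NA handling is needed.
tab <- xtabs(~ id + country, data = df)
print(tab)

# Optional: convert the table to a plain data frame in wide form,
# with the ids as row names and one column per country.
wide <- as.data.frame.matrix(tab)
print(wide)
```

The result of xtabs is a table object rather than a data frame, which is why the last step converts it when a data frame is needed downstream.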

How do I find the frequency of each class within a group in R?