My data frame looks like this:
year<-c("2000","2000","2001","2002","2000")
gender<-c("M","F","M","F","M")
YG<-data.frame(year,gender)
In this data frame, I want to count the number of "M" and "F" values for each year, then create a new data frame like this:
year M F
1 2000 2 1
2 2001 1 0
3 2002 0 1
I tried something like this:
library(dplyr)
ns<-YG %>%
group_by(year) %>%
count(YG$gender == "M")
Answer 0 (score: 2)
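A solution using reshape2:
library(reshape2)
dcast(YG, year ~ gender)  # no aggregation function given, so dcast defaults to length (a count)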
year F M
1 2000 1 2
2 2001 0 1
3 2002 1 0
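For comparison, the same cross-tabulation can be sketched in base R with table(), without loading any package. This is a different technique from the dcast call above, and the names tab and counts are illustrative only.

```r
# Rebuild the example data from the question
year <- c("2000", "2000", "2001", "2002", "2000")
gender <- c("M", "F", "M", "F", "M")
YG <- data.frame(year, gender)

# table() cross-tabulates the two columns: rows are years, columns are genders
tab <- table(YG$year, YG$gender)

# Turn the contingency table into a plain data frame with one column per gender
counts <- data.frame(
  year = rownames(tab),
  M = as.vector(tab[, "M"]),
  F = as.vector(tab[, "F"])
)
```

Year/gender pairs that never occur come out as 0 because table() fills every cell of the cross-tabulation.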
Or another tidyverse solution:
YG %>%
  group_by(year) %>%
  summarise(M = length(gender[gender == "M"]),
            F = length(gender[gender == "F"]))
year M F
<fct> <int> <int>
1 2000 2 1
2 2001 1 0
3 2002 0 1
Or, as suggested by @zx8754:
YG %>%
  group_by(year) %>%
  summarise(M = sum(gender == "M"),
            F = sum(gender == "F"))
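The sum() version works because a comparison such as gender == "M" yields a logical vector, and sum() treats TRUE as 1 and FALSE as 0. A minimal base R sketch of that idea, using the gender vector from the question (the names is_male, n_male, and n_female are illustrative only):

```r
gender <- c("M", "F", "M", "F", "M")

# The comparison returns one TRUE/FALSE per element
is_male <- gender == "M"

# Summing a logical vector counts the TRUE values
n_male <- sum(is_male)
n_female <- sum(gender == "F")

# With missing values the comparison yields NA, so sum() returns NA
# unless na.rm = TRUE is passed
n_male_with_na <- sum(c("M", NA, "M") == "M", na.rm = TRUE)
```

If the column may contain NA, passing na.rm = TRUE inside summarise() is probably the safer choice.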
Answer 1 (score: 1)
We can use count and spread to get the wide data frame format, passing fill = 0 to spread so that missing year/gender combinations are filled with 0:
library(tidyverse)
YG %>%
  group_by(year) %>%
  count(gender) %>%
  spread(gender, n, fill = 0)
Output:
# A tibble: 3 x 3
# Groups: year [3]
year F M
<fct> <dbl> <dbl>
1 2000 1 2
2 2001 0 1
3 2002 1 0
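In more recent tidyr releases, spread() is marked as superseded in favour of pivot_wider(). The following is a minimal sketch of the same reshaping, assuming dplyr and tidyr 1.0.0 or later are installed. The name wide is illustrative, and count(year, gender) is used here in place of group_by(year) followed by count(gender), so the result is not grouped.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
year <- c("2000", "2000", "2001", "2002", "2000")
gender <- c("M", "F", "M", "F", "M")
YG <- data.frame(year, gender)

wide <- YG %>%
  # Long format: one row per observed year/gender pair, with the count in n
  count(year, gender) %>%
  # Wide format: one column per gender; pairs that never occur become 0
  pivot_wider(names_from = gender, values_from = n,
              values_fill = list(n = 0L))
```

Filling with the integer 0L keeps the count columns as integers, whereas the fill = 0 call above shows them as doubles in the printed tibble.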