I have the following data frame:
[] Group State County Deaths
[1] 01 Nicaragua County A 0
[2] 01 Nicaragua County B 13
[3] 01 Nicaragua County C 0
[4] 02 Mexico County D 0
[5] 02 Mexico County F 0
[6] 02 Mexico County E 0
Within each group, I want to count all rows where Deaths is 0, and then add that count as a new column. The desired result looks like this:
[] Group State County Deaths Counties.without.Deaths
[1] 01 Nicaragua County A 0 2
[2] 01 Nicaragua County B 13 2
[3] 01 Nicaragua County C 0 2
[4] 02 Mexico County D 0 3
[5] 02 Mexico County F 0 3
[6] 02 Mexico County E 0 3
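Is there a specific function for this? I tried using a loop, but as a beginner I could not get it to work. Thanks for your help!
Answer 0: (score: 0)
Something like this: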
library(dplyr)
df <- df %>%
group_by(Group) %>%
mutate(Counties.without.Deaths = sum(Deaths == 0))
You could also use length(Deaths[Deaths == 0]) instead of sum(Deaths == 0), but it may be slightly slower.
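The two counting expressions can be compared side by side. The following is a minimal sketch on a hypothetical vector (deaths is made up for illustration and is not the question's data); it also shows that the two forms stop agreeing once a missing value is present, which is an assumption about data the question does not mention.

```r
# Hypothetical example vector, not taken from the question's data
deaths <- c(0L, 13L, 0L, NA)

# Without missing values, both expressions give the same count
complete <- deaths[!is.na(deaths)]
count_sum <- sum(complete == 0)
count_length <- length(complete[complete == 0])

# With an NA present, the results differ
na_sum <- sum(deaths == 0)                   # NA, because the comparison yields NA
na_sum_rm <- sum(deaths == 0, na.rm = TRUE)  # NA is dropped before summing
na_length <- length(deaths[deaths == 0])     # the NA index keeps an NA element

print(c(count_sum, count_length, na_sum, na_sum_rm, na_length))
```

If the Deaths column can contain NA, sum(Deaths == 0, na.rm = TRUE) is probably the safer form.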
You can also do this in base R, without any additional packages; this would be the fastest option:
df$Counties.without.Deaths <- with(df, ave(Deaths, Group, FUN = function(x) sum(x == 0)))
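To see what ave() is doing in that line, it may help to compare it with tapply(). This is a sketch on a reduced copy of the data (only the Group and Deaths columns; the names per_group and per_row are chosen here for illustration): tapply() returns one count per group, while ave() repeats each group's count for every row of that group, which is why its result can be assigned directly as a column.

```r
# Reduced copy of the data: only the columns needed for the count
df <- data.frame(
  Group  = c(1L, 1L, 1L, 2L, 2L, 2L),
  Deaths = c(0L, 13L, 0L, 0L, 0L, 0L)
)

# One value per group
per_group <- tapply(df$Deaths, df$Group, function(x) sum(x == 0))

# Same counts, repeated so the result has one value per row
per_row <- ave(df$Deaths, df$Group, FUN = function(x) sum(x == 0))

print(per_group)
print(per_row)
```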
A quick benchmark shows that the base option is roughly 7 to 9 times faster here (median about 1121 vs. 150 microseconds):
Unit: microseconds
expr min lq mean median uq max neval
dplyr 1056.020 1091.3915 1267.1185 1121.2920 1318.019 2294.364 100
base 113.771 132.9145 182.4703 149.6885 170.291 2769.136 100
Output of both dplyr and base:
Group State County Deaths Counties.without.Deaths
1 1 Nicaragua County A 0 2
2 1 Nicaragua County B 13 2
3 1 Nicaragua County C 0 2
4 2 Mexico County D 0 3
5 2 Mexico County F 0 3
6 2 Mexico County E 0 3
Answer 1: (score: 0)
merge(df, aggregate(Deaths ~ Group, df, FUN = function(x) sum(x == 0)), by = "Group", suffixes = c("", "counties.without"))
Group State County Deaths Deathscounties.without
1 1 Nicaragua County A 0 2
2 1 Nicaragua County B 13 2
3 1 Nicaragua County C 0 2
4 2 Mexico County D 0 3
5 2 Mexico County F 0 3
6 2 Mexico County E 0 3
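The one-liner above can also be split into two steps, which makes it easier to control the new column's name instead of relying on suffixes. This is only a sketch of one possible variant; the intermediate names counts and result are introduced here for illustration.

```r
# Same data as in the question, rebuilt so the example is self-contained
df <- data.frame(
  Group  = c(1L, 1L, 1L, 2L, 2L, 2L),
  State  = rep(c("Nicaragua", "Mexico"), each = 3),
  County = paste("County", c("A", "B", "C", "D", "F", "E")),
  Deaths = c(0L, 13L, 0L, 0L, 0L, 0L)
)

# Step 1: one row per group with the number of zero-death counties
counts <- aggregate(Deaths ~ Group, df, FUN = function(x) sum(x == 0))

# Step 2: rename the aggregated column so it does not clash with df$Deaths
names(counts)[names(counts) == "Deaths"] <- "Counties.without.Deaths"

# Step 3: join the per-group counts back onto the original rows
result <- merge(df, counts, by = "Group")
print(result)
```

Because the aggregated column is renamed before the join, merge() has no duplicate column names to disambiguate and the suffixes argument is not needed.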
Data:
df <- structure(list(Group = c(1L, 1L, 1L, 2L, 2L, 2L), State = c("Nicaragua",
"Nicaragua", "Nicaragua", "Mexico", "Mexico", "Mexico"), County = c("County A",
"County B", "County C", "County D", "County F", "County E"),
Deaths = c(0L, 13L, 0L, 0L, 0L, 0L)), row.names = c(NA, -6L
), class = "data.frame")