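I want to get the number and percentage of cases that meet a certain condition, grouped by another column.
The group is city and the condition is hour <= 6 (the expected output below matches <= 6, not >= 6).
For example: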
city hour
A 7
A 6
A 3
B 2
C 7
I want to get
city hour<=6
A 2
B 1
C 0
and then the share of such cases within each city:
city hour <= 6 (%)
A 0.6666667
B 1.0000000
C 0.0000000
City --- hour
I think I'm almost there. With
aggregate(hours, list(city), mean)
I get the mean of the hours by city, but I don't know how to get the other results.
MG
Answer 0 (score: 1)
Use the dplyr package.
Data:
df1<-data.frame(city=c(rep("A",3), "B","C"), hour = c(7,6,3,2,7))
Code:
library(dplyr)
df1 %>% group_by(city) %>% summarise(hourLHE6 = sum(hour <= 6), hourPCT = sum(hour <= 6)/length(hour))
Result:
# A tibble: 3 x 3
# city hourLHE6 hourPCT
# <fct> <int> <dbl>
#1 A 2 0.667
#2 B 1 1
#3 C 0 0
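The summarise() call above works because R treats a logical vector as 0/1 in arithmetic, so sum() counts the TRUE values and mean() gives their share. A minimal base R sketch of that idea, using only city A's hours from the example (the variable name `meets` is just for illustration):

```r
# Hours recorded for city A in the example data
hour <- c(7, 6, 3)

# Logical vector: TRUE where the condition holds
meets <- hour <= 6

# TRUE counts as 1 and FALSE as 0, so sum() is the number of matching cases
n_meets <- sum(meets)

# mean() of a logical vector is the proportion of TRUE values,
# i.e. the same as sum(meets) / length(meets)
p_meets <- mean(meets)
```

So `mean(hour <= 6)` could stand in for `sum(hour <= 6)/length(hour)` if you prefer the shorter form.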
Answer 1 (score: 0)
Try this:
x <- structure(list(city = c("A", "A", "A", "B", "C"), hour = c(7,
6, 3, 2, 7)), row.names = c(NA, -5L), class = "data.frame")
> x
city hour
1 A 7
2 A 6
3 A 3
4 B 2
5 C 7
> aggregate(x$hour, by = list(city = x$city), function(z) length(z[z<=6]))
city x
1 A 2
2 B 1
3 C 0
> aggregate(x$hour, by = list(city = x$city), function(z) length(z[z<=6]) / length(z))
city x
1 A 0.6666667
2 B 1.0000000
3 C 0.0000000
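If you would rather get the count and the proportion from a single aggregate() call, one possible sketch is below. It assumes the same example data; the column names `n` and `pct` are made up for illustration. When the function returns two values, aggregate() stores them as a matrix column, so the result is flattened with do.call(data.frame, ...).

```r
# Same example data as above
x <- data.frame(city = c("A", "A", "A", "B", "C"),
                hour = c(7, 6, 3, 2, 7))

# Return both the count and the proportion of hour <= 6 for each city
res <- aggregate(hour ~ city, data = x,
                 FUN = function(z) c(n = sum(z <= 6), pct = mean(z <= 6)))

# The two values sit in one matrix column; split them into ordinary columns
res <- do.call(data.frame, res)
names(res) <- c("city", "n", "pct")
res
```

This keeps everything in base R, at the cost of the extra flattening step.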