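I have a data frame in which athletes receive performance ratings of "Good", "Fair", and "Poor".
I would like to write a function that does the following:
Produce a new data frame containing the athlete's name and the percentage of times that athlete received a "Good" rating.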
Player <- c("Jordan", "Jordan", "Jordan", "Jordan", "Jordan", "Jordan",
"Jordan","Jordan","Jordan", "Barkley", "Barkley", "Barkley", "Barkley",
"Barkley", "Olajuwon", "Olajuwon", "Olajuwon", "Olajuwon", "Olajuwon",
"Kemp", "Kemp", "Kemp", "Kemp", "Kemp", "Kemp")
Rating <- c("Good", "Fair", "Good", "Good", "Good", "Poor", "Good", "Good",
"Good", "Fair", "Fair", "Poor", "Good", "Good", "Good", "Fair", "Good",
"Fair", "Good", "Good", "Good", "Good", "Good", "Good", "Poor")
df <- data.frame(Player, Rating)
My desired output is:
Player PercentGood
Jordan 77.8%
Barkley 40.0%
Olajuwon 60.0%
Kemp 83.3%
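When I receive the file, the percentages are not included, so I would like to run this code each time an updated file is sent to me.
The workflow would be: the file is sent, I apply the code, and a new data frame is produced that summarises the percentage of "Good" ratings each athlete received.
Thanks.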
Answer 0 (score: 0)
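Here is a tidyverse solution that uses scales::percent to format the percentages.
It first creates a new variable, good, coded as 1 when Rating is "Good" and 0 otherwise. It then computes the proportion of 1s for each player.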
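As a quick check of the idea, the mean of a 0/1 indicator is simply the share of 1s. The sketch below uses only Jordan's nine ratings copied from the question; the names jordan, good and share_good are illustrative and not part of the original answer.

```r
# Jordan's nine ratings, copied from the question's data
jordan <- c("Good", "Fair", "Good", "Good", "Good", "Poor", "Good", "Good", "Good")

# 1 where the rating is "Good", 0 otherwise
good <- ifelse(jordan == "Good", 1, 0)

# The mean of a 0/1 vector is the share of 1s: 7 of 9 here
share_good <- mean(good)
round(100 * share_good, 1)  # 77.8
```

The same calculation, done once per player, gives the PercentGood column.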
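library(tidyverse)
library(scales)
# Flag "Good" ratings as 1 and everything else as 0
df %>% mutate(good = ifelse(Rating == "Good", 1, 0)) %>%
  # fct_inorder() keeps the players in order of first appearance
  group_by(Player = fct_inorder(Player)) %>%
  # accuracy = 0.1 formats the result to one decimal place
  summarise(PercentGood = percent(mean(good), accuracy = 0.1))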
# A tibble: 4 x 2
# Player PercentGood
# <fct> <chr>
#1 Jordan 77.8%
#2 Barkley 40.0%
#3 Olajuwon 60.0%
#4 Kemp 83.3%
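Since the question asks for something that can be rerun whenever an updated file arrives, one option is to wrap the same steps in a function. This is only a sketch: the function name percent_good is hypothetical, and it assumes the incoming data always has columns named Player and Rating, as in the question.

```r
library(dplyr)
library(forcats)
library(scales)

# Hypothetical helper: share of "Good" ratings per player.
# Assumes `data` has columns named Player and Rating.
percent_good <- function(data) {
  data %>%
    # Keep players in order of first appearance rather than alphabetical order
    group_by(Player = fct_inorder(as.character(Player))) %>%
    # mean() of a logical vector is the proportion of TRUE values;
    # accuracy = 0.1 formats it with one decimal place
    summarise(PercentGood = percent(mean(Rating == "Good"), accuracy = 0.1))
}
```

With the df from the question, percent_good(df) should return the same four rows as above. For a file on disk, something like percent_good(read.csv("ratings.csv")) could be used, where "ratings.csv" is a placeholder file name.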