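I am working with a dataset in R and a variable called points. The dataset covers every game of a basketball season, and points takes the values 0, 1, 2, and 3. I have already worked out how to find each team's shooting percentage. Now I need to use mutate() to create a new variable called Totalpoints that adds these points up, so that points = 2 counts as 2 times its frequency and points = 3 counts as 3 times its frequency. Here is my code so far: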
Basketball1 <- Basketball %>%
  select("TeamName", "points")
Basketball1 %>%
  mutate(totalpoints = (0 * (Basketball1$points == "0")) +
                       (2 * (Basketball1$points == "2")) +
                       (1 * (Basketball1$points == "1")) +
                       (3 * (Basketball1$points == "3")))
I need help creating this new variable so that the points are added up with the correct weights.
Answer 0 (score: 0)
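Here is a reproducible example, assuming your dataset looks like this or something similar. If you only want the total points for each team, Ben Bolker's answer works well.
In addition, if you want to compare how often each basketball team scores each type of point, or to see how much each point type contributes to the total, the following may help.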
library(dplyr)
library(ggplot2)
points <- c('1', '2', '3', '2', '1')
team_name <- c('rockets', 'rockets', 'rockets', 'rockets', 'rockets')
# Build the data frame directly; stringsAsFactors = FALSE keeps points as character in R < 4.0
Basketball1 <- data.frame(points, team_name, stringsAsFactors = FALSE)
# Create a reference table of point types with a numeric column for multiplication
points_df <- Basketball1 %>%
  distinct(points) %>%
  mutate(points_num = as.numeric(points))
# Count each point type per team, then weight the count by the point value
Basketball_total_points <- Basketball1 %>%
  group_by(points, team_name) %>%
  summarize(frequency_count = n()) %>%
  left_join(points_df, by = "points") %>%
  mutate(total_points = frequency_count * points_num) %>%
  select(-points_num)
# Plot distribution of total points by type of point
Basketball_total_points %>%
  ggplot(aes(team_name, total_points, fill = as.factor(points))) +
  geom_bar(stat = 'identity')
# Plot distribution of frequency by type of point
Basketball_total_points %>%
  ggplot(aes(team_name, frequency_count, fill = as.factor(points))) +
  geom_bar(stat = 'identity')
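If all you need is the weighted total per team, it may be simpler to convert points to a number and sum it, because summing the values already counts each 2 twice and each 3 three times. The other answer referred to above is not reproduced here, so this is only a minimal sketch of one possible approach. The sample rows, the second team "spurs", and the names points_num and team_totals are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical sample data: one row per scoring event, points stored as text
Basketball1 <- data.frame(
  TeamName = c("rockets", "rockets", "rockets", "spurs", "spurs"),
  points   = c("1", "2", "3", "2", "0"),
  stringsAsFactors = FALSE
)

team_totals <- Basketball1 %>%
  # as.character() guards against points being a factor, where
  # as.numeric() alone would return level codes instead of values
  mutate(points_num = as.numeric(as.character(points))) %>%
  group_by(TeamName) %>%
  # Summing the numeric values equals 1*n1 + 2*n2 + 3*n3 for each team
  summarise(Totalpoints = sum(points_num))

team_totals
```

Replacing summarise() with mutate() in the last step should keep one row per scoring event and repeat the team total on each row, which is closer to what the question asks for with mutate().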