Creating a new variable with mutate

Time: 2018-04-14 17:38:21

Tags: r

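I am working with a dataset in R that has a variable called points. The dataset covers every game of a basketball season, and points takes the values 0, 1, 2, and 3. I have already worked out each team's shooting percentage. Now I need to use mutate() to create a new variable called Totalpoints that adds these points up, so that points = 2 counts as twice its frequency and points = 3 counts as three times its frequency. Here is my code so far: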

Basketball1 <- Basketball %>%
select("TeamName","points") 

Basketball1 %>%
mutate(totalpoints = (0*(Basketball1$points == "0"))+ 
     (2*(Basketball1$points == "2"))+
     (1*(Basketball1$points == "1"))+
     (3*(Basketball1$points == "3"))) 

I need help creating this new variable so that the points are added up with the correct weights.

1 Answer:

Answer 0 (score: 0)

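Here is a reproducible example, assuming your dataset looks something like the one below. If all you want is the total score for each team, Ben Bolker's answer works well.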

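In addition, this may help if you want to compare how often each basketball team scores each type of point, or to see how much each type of point contributes to the total.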

library(dplyr)
library(ggplot2)

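points <- c('1', '2', '3', '2', '1')
team_name <- c('rockets', 'rockets', 'rockets', 'rockets', 'rockets')
# stringsAsFactors = FALSE keeps points as character, so as.numeric() parses "2" as 2
Basketball1 <- data.frame(points, team_name, stringsAsFactors = FALSE)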

# Lookup table of point types with a numeric column for multiplication
points_df <- Basketball1 %>%
  distinct(points) %>%
  mutate(points_num = as.numeric(as.character(points)))
# Count each point type per team, then weight the counts by point value
Basketball_total_points <- Basketball1 %>%
  group_by(points, team_name) %>%
  summarize(frequency_count = n()) %>%
  ungroup() %>%
  left_join(points_df, by = "points") %>%
  mutate(total_points = frequency_count * points_num) %>%
  select(-points_num)

# Plot distribution of total points by type of point
Basketball_total_points %>%
  ggplot(aes(team_name, total_points, fill = as.factor(points))) +
  geom_bar(stat = 'identity')

# Plot distribution of frequency by type of point
Basketball_total_points %>%
  ggplot(aes(team_name, frequency_count, fill = as.factor(points))) +
  geom_bar(stat = 'identity')
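
For the simpler case raised in the question, a single Totalpoints value per team, the following is a minimal sketch rather than a reproduction of Ben Bolker's answer, which is not shown here. It assumes the same toy data as above and that points is stored as text; points_num and team_totals are illustrative names. Because each row is one scoring event, summing the numeric point values per team is the same as weighting each point type by its frequency.

```r
library(dplyr)

# Hypothetical toy data in the same shape as the example above
Basketball1 <- data.frame(
  points = c("1", "2", "3", "2", "1"),
  team_name = c("rockets", "rockets", "rockets", "rockets", "rockets"),
  stringsAsFactors = FALSE
)

# Convert each row's point type to a number, then add the values up per team.
# Summing the values equals 1 * n(1) + 2 * n(2) + 3 * n(3), and 0-point rows add nothing.
team_totals <- Basketball1 %>%
  mutate(points_num = as.numeric(points)) %>%
  group_by(team_name) %>%
  summarize(Totalpoints = sum(points_num))

print(team_totals)
```

If the total should sit next to every original row instead of being collapsed to one row per team, replacing the summarize() step with mutate(Totalpoints = sum(points_num)) after group_by() should do that.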