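I'm trying to teach myself some R by playing around in RStudio and making charts from the most recent NBA season's data. Some charts contain duplicate player rows; I want to include some of them and exclude others.
My dataset comes from https://www.basketball-reference.com/leagues/NBA_2019_per_game.html (I don't know how to link directly to the CSV data, but the dataset can be found under the "Share & More" menu item). After downloading the stats to a file, I imported it into RStudio...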
> stats <- read.csv("~/Downloads/2018-2019 NBA per game stats.txt")
I made a sample scatterplot...
> ggplot(stats, aes(x=MP,y=FGA)) +geom_point()
But I noticed that many points are duplicated for players who were traded during the year and played for multiple teams. For example, here are Ryan Anderson and Trevor Ariza...
Player Tm MP FGA
Ryan Anderson\anderry01 TOT 322 69
Ryan Anderson\anderry01 PHO 278 60
Ryan Anderson\anderry01 MIA 44 9
OG Anunoby\anunoog01 TOR 1352 404
Trevor Ariza\arizatr01 TOT 2349 736
Trevor Ariza\arizatr01 PHO 884 227
Trevor Ariza\arizatr01 WAS 1465 509
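How can I create a scatterplot that includes players who played for only one team (like OG Anunoby) or a player's full-year stats (the TOT lines for Ryan Anderson and Trevor Ariza), but excludes partial-season rows (the PHO, MIA, and WAS lines for Ryan Anderson and Trevor Ariza)?
I imagine there's a way to use some ifelse statements to create a dummy variable, or to pass that information to ggplot or geom_point, but I've had a hard time finding similar examples online.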
Answer 0 (score: 1)
Consider adding indicator columns with ave (inline count aggregation) and ifelse (conditional logic), then subset according to the plot you need:
# NEW COLUMNS
stats$team_count <- with(stats, ave(MP, Player, FUN=length))
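# as.character() guards against Tm being a factor, which ifelse would otherwise turn into integer codes
stats$tot_indicator <- with(stats, ifelse(team_count == 1, 'TOT', as.character(Tm)))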
# SUBSETTED DATA SCATTERPLOT (ONE TEAM PLAYERS)
ggplot(subset(stats, team_count == 1), aes(x=MP, y=FGA)) + geom_point()
# SUBSETTED DATA SCATTERPLOT (ALL PLAYERS' TOT)
ggplot(subset(stats, tot_indicator == 'TOT'), aes(x=MP, y=FGA)) + geom_point()
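To see what the ave count looks like in practice, here is a minimal base R sketch on the seven sample rows from the question. The data frame is typed in by hand, the player IDs after the backslash are dropped for brevity, and the name full_season is just an illustrative choice, not something from the answer.

```r
# Sample rows from the question (player IDs omitted for brevity)
stats <- data.frame(
  Player = c("Ryan Anderson", "Ryan Anderson", "Ryan Anderson",
             "OG Anunoby",
             "Trevor Ariza", "Trevor Ariza", "Trevor Ariza"),
  Tm  = c("TOT", "PHO", "MIA", "TOR", "TOT", "PHO", "WAS"),
  MP  = c(322, 278, 44, 1352, 2349, 884, 1465),
  FGA = c(69, 60, 9, 404, 736, 227, 509),
  stringsAsFactors = FALSE
)

# Number of rows per player, repeated on every row of that player
stats$team_count <- ave(stats$MP, stats$Player, FUN = length)

# Keep single-team players plus the TOT row of traded players
full_season <- subset(stats, team_count == 1 | Tm == "TOT")
print(full_season[, c("Player", "Tm", "MP", "FGA")])
```

Filtering on team_count == 1 | Tm == "TOT" directly gives the same rows as the tot_indicator column, so either form can be passed to ggplot.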
Answer 1 (score: 1)
1) To create a scatterplot that includes only players who played for one team (like OG Anunoby):
library(tidyverse)
# first, identify which players played for only 1 team.
single_team_players <- stats %>%
select(Player) %>%
group_by(Player) %>%
# counts how many teams a player has played for
summarise(count = n()) %>%
# keep only players that have played for 1 team
filter(count == 1)
# then keep only these players in stats
stats_single_team_players <- stats %>%
filter(Player %in% single_team_players$Player)
# create scatterplot
ggplot(stats_single_team_players, aes(x=MP,y=FGA))+
geom_point()+
labs(title = "Single Team Players")
2) To create a scatterplot of players' full-year stats (the TOT lines for Ryan Anderson and Trevor Ariza) rather than partial seasons (the PHO, MIA, and WAS lines for Ryan Anderson and Trevor Ariza):
# filter for single team players OR team = TOT
total_year_stats <- stats %>%
filter((Player %in% single_team_players$Player)|
(Tm == "TOT"))
# graph scatterplot
ggplot(total_year_stats, aes(x=MP,y=FGA)) +
geom_point()+
labs(title = "Total Year Stats")
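The two steps above can probably be collapsed into a single pipeline with dplyr's add_count. This is only a sketch: full_season_rows and sample_stats are hypothetical names introduced here, and the sample rows are typed in by hand from the question.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: keep one full-season row per player
full_season_rows <- function(df) {
  df %>%
    add_count(Player, name = "n_rows") %>%   # rows per player
    filter(n_rows == 1 | Tm == "TOT") %>%    # single-team players or the TOT line
    select(-n_rows)
}

# Sample rows from the question (player IDs omitted for brevity)
sample_stats <- tibble(
  Player = c("Ryan Anderson", "Ryan Anderson", "Ryan Anderson",
             "OG Anunoby",
             "Trevor Ariza", "Trevor Ariza", "Trevor Ariza"),
  Tm  = c("TOT", "PHO", "MIA", "TOR", "TOT", "PHO", "WAS"),
  MP  = c(322, 278, 44, 1352, 2349, 884, 1465),
  FGA = c(69, 60, 9, 404, 736, 227, 509)
)

total_year_stats <- full_season_rows(sample_stats)

# Same plot as above, built from the one-step result
p <- ggplot(total_year_stats, aes(x = MP, y = FGA)) +
  geom_point() +
  labs(title = "Total Year Stats")
```

Calling full_season_rows(stats) on the full download should behave the same way, assuming its columns are also named Player and Tm.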
Answer 2 (score: 0)
Use filter to remove "TOT", then use group_by and summarize. You can then use ggplot on the resulting data frame.
distinct would also work here, as long as "TOT" always comes before the actual teams.
library(tidyverse)
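# sample rows entered with tribble(), since the player names contain spaces
data <- tribble(
  ~Player, ~Tm, ~MP, ~FGA,
  "Ryan Anderson\\anderry01", "TOT", 322, 69,
  "Ryan Anderson\\anderry01", "PHO", 278, 60,
  "Ryan Anderson\\anderry01", "MIA", 44, 9,
  "OG Anunoby\\anunoog01", "TOR", 1352, 404,
  "Trevor Ariza\\arizatr01", "TOT", 2349, 736,
  "Trevor Ariza\\arizatr01", "PHO", 884, 227,
  "Trevor Ariza\\arizatr01", "WAS", 1465, 509
)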
data %>%
filter(Tm != "TOT") %>%
group_by(Player) %>%
summarize(MP = sum(MP), FGA = sum(FGA))
# A tibble: 3 x 3
Player MP FGA
<chr> <dbl> <dbl>
1 "OG Anunoby\anunoog01" 1352 404
2 "Ryan Anderson\anderry01" 322 69
3 "Trevor Ariza\arizatr01" 2349 736
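For the distinct route, a minimal sketch might look like the following. It leans on the fact that distinct keeps the first row it encounters for each player, which is why the row order matters. The names sample_rows, first_rows, and safe_rows are illustrative, and the arrange step is an added assumption for data where the TOT row is not guaranteed to come first.

```r
library(dplyr)

# Sample rows from the question (player IDs omitted for brevity)
sample_rows <- tibble(
  Player = c("Ryan Anderson", "Ryan Anderson", "Ryan Anderson",
             "OG Anunoby",
             "Trevor Ariza", "Trevor Ariza", "Trevor Ariza"),
  Tm  = c("TOT", "PHO", "MIA", "TOR", "TOT", "PHO", "WAS"),
  MP  = c(322, 278, 44, 1352, 2349, 884, 1465),
  FGA = c(69, 60, 9, 404, 736, 227, 509)
)

# Keeps the first row per player, so it relies on TOT being listed
# before the individual team rows
first_rows <- sample_rows %>%
  distinct(Player, .keep_all = TRUE)

# Order-independent variant (an assumption, not from the answer):
# sort so that TOT rows come first within each player, then deduplicate
safe_rows <- sample_rows %>%
  arrange(Player, Tm != "TOT") %>%
  distinct(Player, .keep_all = TRUE)
```

Unlike the filter and summarize approach, this keeps the Tm column, so the traded players remain labeled TOT in the result.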
Also, if you're working with Basketball Reference data, check out the ballr package (https://cran.r-project.org/web/packages/ballr/index.html), which provides an API for interacting with basketball-reference.com.