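I have some athlete physical activity data, and I am plotting current data against historical data. The two datasets I'm using are below.
Historical data - quartertwo2017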
``` r
Player.Name Date Distance HIR V6
Player 1 10/9/17 7060.621 2506.20 12.50
Player 1 15/7/17 4978.625 1596.19 44.26
Player 1 2/7/17 6787.667 2048.61 39.67
Player 1 22/7/17 6881.126 2065.80 31.48
Player 1 24/6/17 5802.060 2204.87 65.48
Player 1 29/7/17 7035.075 2085.32 22.56
Player 1 3/9/17 7016.175 2659.18 66.14
Player 1 5/8/17 6137.929 2154.36 25.49
Player 1 9/6/17 5515.685 2054.66 189.55
Player 1 9/7/17 6311.515 2144.63 20.54
Player 2 1/4/17 7150.221 2307.78 233.88
Player 2 10/9/17 8115.131 3136.33 217.86
Player 2 13/5/17 6391.008 2325.89 101.85
Player 2 15/7/17 6919.630 2136.40 118.64
Player 2 17/6/17 6366.357 2177.28 189.09
Player 2 19/8/17 7230.393 2530.59 104.58
Player 2 2/7/17 6620.122 1908.88 36.34
Player 2 20/5/17 7335.201 2250.34 152.84
Player 2 22/4/17 6956.030 2483.05 376.06
Player 2 22/7/17 7643.874 2370.89 172.20
Player 2 24/3/17 4258.366 1447.50 195.18
Player 2 24/6/17 7305.026 2771.67 297.99
Player 2 26/8/17 8024.780 2867.62 318.08
Player 2 27/5/17 6714.186 2409.16 125.31
Player 2 28/4/17 7106.519 2832.97 337.05
Player 2 29/7/17 8693.820 1961.28 27.80
Player 2 3/9/17 8005.006 2741.90 139.24
Player 2 5/8/17 7676.653 2475.58 111.07
Player 2 9/6/17 7176.619 2645.06 137.82
Player 2 9/7/17 7946.231 3140.44 126.59
```
Current data - quartertwo2018
``` r
Player.Name Date Distance HIR V6
Player 1 2/3/18 5234.390 1513.73 41.82
Player 2 2/3/18 6352.987 2054.94 166.72
```
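Specifically, I am using `geom_point` to plot each athlete's current total distance against the distance they typically cover, which is shown with `geom_boxplot`. My code so far is: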
    plot_TD_Q2 <- ggplot(data = quartertwo2017, aes(x = Player.Name, y = Distance)) +
      geom_boxplot(fill = "light blue") +
      coord_flip() +
      ggtitle("Quarter 2") +
      xlab("Player") +
      ylab("Total Distance") +
      theme_classic()

    plot_TD_Q2 <- plot_TD_Q2 +
      geom_point(data = quartertwo2018, aes(x = Player.Name, y = Distance),
                 position = position_jitter(width = 0.5),
                 col = "red",
                 cex = 3)
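I am very happy with the output this code produces. However, I would like to know whether the colour of each boxplot can be changed based on a z-score calculation.
For example, I would like a boxplot to turn red if the athlete's current total distance (the `geom_point`) is more than 3 SD away from their historical mean. If the current total distance is between 1 and 2.99 SD away, the boxplot should turn amber, and if it is within 1 SD, it should be filled green.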
My historical data comes from the dataset `quartertwo2017`, while my current data comes from the dataset `quartertwo2018`. So for the z-score, x is the current total distance from `quartertwo2018`, compared against the mean and standard deviation from `quartertwo2017`.
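I hope my question makes sense. I realise this may be a bit advanced, particularly as I still consider myself an R novice. Any help is greatly appreciated, and please let me know if more information is needed. I am new to posting on Stack Overflow, so I hope I have put this question together correctly.
Thanks.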
Answer 0 (score: 0)
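Consider computing a z_score by aggregating the historical data per player (e.g. with `sd`), then merging the result into the current data and using `ifelse` to conditionally assign a new column. This new column can then be mapped to colour inside `aes()`: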
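    # Historical mean and SD of Distance per player
    agg_mean <- setNames(aggregate(Distance ~ Player.Name, quartertwo2017, mean),
                         c("Player.Name", "Distance_mean"))
    agg_sd   <- setNames(aggregate(Distance ~ Player.Name, quartertwo2017, sd),
                         c("Player.Name", "Distance_sd"))
    aggdf <- merge(agg_mean, agg_sd, by = "Player.Name")

    quartertwo2018 <- merge(quartertwo2018, aggdf, by = "Player.Name")

    # z-score of the current distance against the player's own history
    z <- (quartertwo2018$Distance - quartertwo2018$Distance_mean) /
      quartertwo2018$Distance_sd
    quartertwo2018$z_score <- ifelse(abs(z) >= 3, 'high',
                                     ifelse(abs(z) >= 1, 'med', 'low'))

    plot_TD_Q2 <- ggplot(data = quartertwo2017,
                         aes(x = Player.Name, y = Distance)) +
      geom_boxplot(fill = "light blue") +
      coord_flip() + ggtitle("Quarter 2") +
      xlab("Player") + ylab("Total Distance") + theme_classic() +
      geom_point(data = quartertwo2018,
                 aes(x = Player.Name, y = Distance, colour = z_score),
                 position = position_jitter(width = 0.5),
                 cex = 3) +
      # RED, ORANGE/RED, GREEN BY HEX COLOR CODE, named so each band gets its own colour
      scale_color_manual(values = c(high = "#FF0000", med = "#FF6600", low = "#339900"))

    plot_TD_Q2

Output (with the posted data, Player 1 is roughly 1.5 SD below their historical mean and falls in the med band, while Player 2 is roughly 0.9 SD below and falls in the low band, so the points are drawn orange and green respectively)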
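Since the question asks for the box itself to change colour, one possible extension is to attach each player's band to the historical rows and map it to the boxplot `fill`. The following is only a sketch under stated assumptions: it uses a handful of rows copied from the posted data (so the bands can differ from those obtained with the full data), and the names `hist_stats`, `current`, `band`, `hist_banded` and `plot_fill` are invented for this example.

```r
library(ggplot2)

# A few rows copied from the question's data (subset only, for brevity).
quartertwo2017 <- data.frame(
  Player.Name = rep(c("Player 1", "Player 2"), each = 5),
  Distance = c(7060.621, 4978.625, 6787.667, 6881.126, 5802.060,
               7150.221, 8115.131, 6391.008, 6919.630, 6366.357)
)
quartertwo2018 <- data.frame(
  Player.Name = c("Player 1", "Player 2"),
  Distance = c(5234.390, 6352.987)
)

# Historical mean and SD per player (merged on the shared Player.Name column).
hist_stats <- merge(
  setNames(aggregate(Distance ~ Player.Name, quartertwo2017, mean),
           c("Player.Name", "hist_mean")),
  setNames(aggregate(Distance ~ Player.Name, quartertwo2017, sd),
           c("Player.Name", "hist_sd"))
)

# z-score of each current value against that player's own history,
# then binned on |z| into [0, 1), [1, 3) and [3, Inf).
current <- merge(quartertwo2018, hist_stats, by = "Player.Name")
current$z <- (current$Distance - current$hist_mean) / current$hist_sd
current$band <- cut(abs(current$z), breaks = c(0, 1, 3, Inf),
                    labels = c("low", "med", "high"), right = FALSE)

# Attach each player's band to the historical rows so it can drive the fill.
hist_banded <- merge(quartertwo2017, current[, c("Player.Name", "band")],
                     by = "Player.Name")

plot_fill <- ggplot(hist_banded, aes(x = Player.Name, y = Distance)) +
  geom_boxplot(aes(fill = band)) +
  geom_point(data = current, colour = "black", size = 3) +
  coord_flip() +
  scale_fill_manual(values = c(low = "#339900", med = "#FF6600",
                               high = "#FF0000")) +
  labs(title = "Quarter 2", x = "Player", y = "Total Distance", fill = "Band") +
  theme_classic()

plot_fill
```

Because the band is a single value per player, every historical row for that player carries the same label, which is what lets one box take one fill colour. The current point is drawn without jitter here so that its position reflects the actual distance.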