I have two systems that each run two benchmarks, and I collect two metrics from every run.
df1 <- data.frame(Benchmark = c("Benchmark1", "Benchmark2"),
                  Metric1 = c(120, 200),
                  Metric2 = c(200, 150))
df2 <- data.frame(Benchmark = c("Benchmark1", "Benchmark2"),
                  Metric1 = c(100, 150),
                  Metric2 = c(200, 180))
Now I reshape the data into a single data frame for plotting with ggplot:
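library(dplyr)
library(tidyr)

df <- left_join(df1, df2, by = "Benchmark") %>%
  gather(Metric, Value, 2:5) %>%
  # left_join() suffixes the clashing columns: .x comes from df1, .y from df2
  mutate(System = ifelse(grepl("\\.x$", Metric), "System1", "System2"),
         Metric = ifelse(grepl("1", Metric), "Metric1", "Metric2"))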
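The suffix matching above depends on the `.x`/`.y` names that left_join() generates. As a possible alternative, here is a minimal sketch that tags each row with its system before reshaping. It assumes tidyr 1.0.0 or later for pivot_longer(); the name `df_long` is made up for illustration, and the rows come out in a different order than in `df`.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(Benchmark = c("Benchmark1", "Benchmark2"),
                  Metric1 = c(120, 200),
                  Metric2 = c(200, 150))
df2 <- data.frame(Benchmark = c("Benchmark1", "Benchmark2"),
                  Metric1 = c(100, 150),
                  Metric2 = c(200, 180))

# Stack the two systems; .id records which named input each row came from.
# Then move the two metric columns into long form.
df_long <- bind_rows(System1 = df1, System2 = df2, .id = "System") %>%
  pivot_longer(cols = c(Metric1, Metric2),
               names_to = "Metric", values_to = "Value")
```

This yields the same four columns (System, Benchmark, Metric, Value) without any pattern matching on column names.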
With that I can get a nice chart like this:
library(ggplot2)

ggplot(df %>% filter(Metric == "Metric1"),
       aes(x = Benchmark, y = Value, fill = System)) +
  geom_col(position = "dodge")
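Now I would like to add a new group of bars showing, for each system, the geometric mean (geomean) of these metrics across the benchmarks.
My data frame therefore needs 2 x 2 = 4 new rows, one per (System, Metric) combination, each holding the geomean of the benchmark values for that combination.
I know I can use base R to select the matching data frame columns, compute the mean, and then add the new rows by hand with bind_rows. Is there a more automated way to do this with dplyr? Perhaps by combining group_by() with other functions?
Thanks in advance.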
Answer 0: (score: 2)
Are you looking for something like this?
Wrangled dataset:
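library(dplyr)
library(tidyr)

# gm_mean() is the geometric-mean helper credited below.
# Note: this reuses the name df2, overwriting the second system's input data.
df2 <- df %>%
  group_by(Metric, System) %>%
  mutate(GM = gm_mean(Value)) %>%   # one geomean per (Metric, System)
  ungroup() %>%
  spread(Benchmark, Value) %>%      # Benchmark1 and Benchmark2 become columns next to GM
  gather(x, y, -Metric, -System)    # back to long form; GM is now a pseudo-benchmark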
> df2
# A tibble: 12 x 4
Metric System x y
<chr> <chr> <chr> <dbl>
1 Metric1 System1 GM 154.9193
2 Metric1 System2 GM 122.4745
3 Metric2 System1 GM 173.2051
4 Metric2 System2 GM 189.7367
5 Metric1 System1 Benchmark1 120.0000
6 Metric1 System2 Benchmark1 100.0000
7 Metric2 System1 Benchmark1 200.0000
8 Metric2 System2 Benchmark1 200.0000
9 Metric1 System1 Benchmark2 200.0000
10 Metric1 System2 Benchmark2 150.0000
11 Metric2 System1 Benchmark2 150.0000
12 Metric2 System2 Benchmark2 180.0000
The function for computing the geometric mean is taken from the accepted answer to this question.
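The linked helper is not reproduced in this answer. As a stand-in, here is a minimal sketch of what such a function could look like; it is an assumption rather than the linked code, and it only handles strictly positive values with no NAs.

```r
# Hypothetical stand-in for the gm_mean() helper used above.
# Assumes x contains only positive, non-missing numbers.
gm_mean <- function(x) {
  exp(mean(log(x)))  # mean of the logs, mapped back with exp()
}

# Metric1 on System1: equals sqrt(120 * 200), the 154.9193 shown in the table above
gm_mean(c(120, 200))
```

Working in log space avoids overflow when many large values are multiplied together.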
Plot (faceted to show both Metric1 and Metric2):
library(ggplot2)

ggplot(df2,
       aes(x = x, y = y, fill = System)) +
  geom_col(position = "dodge") +
  facet_grid(Metric ~ .)
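If you would rather keep the original column names (Benchmark, Value) and simply append the four geomean rows, group_by() plus summarise() followed by bind_rows() may be closer to what the question describes. A minimal sketch, assuming dplyr 1.0.0 or later for the `.groups` argument; the `gm_mean()` defined here and the names `gm_rows` and `df_with_gm` are illustrative, and `df` is rebuilt inline so the snippet runs on its own.

```r
library(dplyr)

# Long-format data equivalent to the question's df (values copied from df1 and df2).
df <- data.frame(
  Benchmark = rep(c("Benchmark1", "Benchmark2"), times = 4),
  Metric    = rep(c("Metric1", "Metric2"), each = 2, times = 2),
  System    = rep(c("System1", "System2"), each = 4),
  Value     = c(120, 200, 200, 150, 100, 150, 200, 180)
)

# Illustrative geometric mean; assumes positive, non-missing values.
gm_mean <- function(x) exp(mean(log(x)))

# One summary row per (System, Metric), labelled as a pseudo-benchmark "GM".
gm_rows <- df %>%
  group_by(System, Metric) %>%
  summarise(Value = gm_mean(Value), .groups = "drop") %>%
  mutate(Benchmark = "GM")

# Original 8 rows plus the 4 geomean rows.
df_with_gm <- bind_rows(df, gm_rows)
```

The result can go straight into the question's ggplot call, with "GM" showing up as a third group on the x axis.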
Answer 1: (score: 0)
df <- left_join(df1, df2, by = "Benchmark") %>%
  gather(Metric, Value, 2:5) %>%
  mutate(System = ifelse(grepl("\\.x$", Metric), "System1", "System2"),
         Metric = ifelse(grepl("1", Metric), "Metric1", "Metric2"))

# Arithmetic mean across systems for each (Benchmark, Metric) pair,
# appended as an extra "Mean" series
df <- df %>%
  group_by(Benchmark, Metric) %>%
  summarise(Value = mean(Value, na.rm = TRUE)) %>%
  mutate(System = "Mean") %>%
  bind_rows(., df)

ggplot(df %>% filter(Metric == "Metric1"),
       aes(x = Benchmark, y = Value, fill = System)) +
  geom_col(position = "dodge")