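Task: I want to reorder a factor variable by the difference between its value when a second variable equals 1 and its value when that second variable equals 0. Here is a reproducible example to clarify: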
# Package
library(tidyverse)
# Create fake data
df1 <- data.frame(place = c("A", "B", "C"),
                  avg = c(3.4, 4.5, 1.8))
# Plot, but it's not in order of value
ggplot(df1, aes(x = place, y = avg)) +
  geom_point(size = 4)
# Now put it in order
df1$place <- factor(df1$place, levels = df1$place[order(df1$avg)])
# Plots in order now
ggplot(df1, aes(x = place, y = avg)) +
  geom_point(size = 4)
# Adding second, conditional variable (called: new)
df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                  new = rep(0:1, 3),
                  avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))
ggplot(df2, aes(x = place, y = avg, col = factor(new))) +
  geom_point(size = 3)
Goal: I want to order and plot the factor variable place by the difference in avg between new == 1 and new == 0.
Answer 0: (score: 1)
You can create the levels for the place column as follows:
library(tidyr)
df2$place <- factor(df2$place, levels=with(spread(df2, new, avg), place[order(`1` - `0`)]))
ggplot(df2, aes(x = place, y = avg, col = factor(new))) +
  geom_point(size = 3) + labs(color = 'new')
This gives a plot with place ordered by avg(new = 1) - avg(new = 0).
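The one-line levels expression in this answer can be unpacked into separate steps. The following is only a sketch of the same idea, assuming tidyr 1.0.0 or later so that pivot_wider() is available (tidyr documents it as the replacement for the superseded spread()); the names wide and lvls are introduced here for illustration.

```r
library(tidyr)

df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                  new = rep(0:1, 3),
                  avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))

# One row per place, with avg for new == 0 and new == 1 in columns `0` and `1`
wide <- pivot_wider(df2, names_from = new, values_from = avg)

# Difference avg(new = 1) - avg(new = 0) for each place
wide$diff <- wide$`1` - wide$`0`

# Places sorted by that difference, smallest first (illustrative name: lvls)
lvls <- as.character(wide$place[order(wide$diff)])

# Apply the new level order; ggplot2 uses it for the x axis
df2$place <- factor(df2$place, levels = lvls)
levels(df2$place)
```

In this data B and C both differ by roughly -0.3, so their relative position is decided by floating-point rounding; rounding wide$diff before ordering would make such near-ties explicit.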
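Answer 1: (score: 1)
If I understand the goal correctly, place A has the largest difference:
avg(new = 0) - avg(new = 1) = 1.1
So you can spread the data frame to compute the difference, gather it back, and then plot avg against place, reordered by diff. Or, if you want A first, by -diff.
(But let me know if I have misunderstood.)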
df2 %>%
  spread(new, avg) %>%
  mutate(diff = `0` - `1`) %>%
  gather(new, avg, -diff, -place) %>%
  ggplot(aes(reorder(place, diff), avg)) +
  geom_point(aes(color = factor(new)), size = 3)
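To see what reorder() does with diff and -diff, here is a small sketch that computes the difference per place without reshaping and then inspects the resulting level order. It assumes each place has exactly one row with new == 0 and one with new == 1; the name with_diff is introduced only for this example.

```r
library(dplyr)

df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                  new = rep(0:1, 3),
                  avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))

# Attach avg(new = 0) - avg(new = 1) to every row of its place
with_diff <- df2 %>%
  group_by(place) %>%
  mutate(diff = avg[new == 0] - avg[new == 1]) %>%
  ungroup()

# reorder() sorts levels by the per-level mean of its second argument, ascending
asc <- levels(reorder(with_diff$place, with_diff$diff))

# Negating the score puts the place with the largest difference first
desc <- levels(reorder(with_diff$place, -with_diff$diff))
```

Either reorder() call can be used directly inside aes(), as in the pipeline above.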
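Answer 2: (score: 0)
First compute the column with dplyr:
# diff(avg) is avg(new = 1) - avg(new = 0), given rows ordered by new within each place
df2 <- df2 %>% group_by(place) %>% mutate(diff = diff(avg)) %>% ungroup()
ggplot(df2, aes(x = place, y = diff, color = diff)) +
  geom_point(size = 3)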
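Note that diff(avg) relies on the new == 0 row coming before the new == 1 row within each place. A minimal sketch, assuming the rows might arrive in a different order (the reversed data frame shuffled is a made-up input for illustration), sorts first so the sign of diff stays consistent:

```r
library(dplyr)

df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                  new = rep(0:1, 3),
                  avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))

# Hypothetical input: same data with the rows reversed
shuffled <- df2[nrow(df2):1, ]

# Sort by place and new so that diff() always computes avg(new = 1) - avg(new = 0)
fixed <- shuffled %>%
  arrange(place, new) %>%
  group_by(place) %>%
  mutate(diff = diff(avg)) %>%
  ungroup()
```

The resulting diff column can then be passed to reorder(place, diff) if the x axis should follow the difference.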