Ordering and plotting one variable conditional on a second

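Time: 2017-11-02 20:51:39

Tags: r ggplot2

Task: I would like to reorder a factor variable by the difference between its average when a second variable equals 1 and its average when that second variable equals 0. Here is a reproducible example to clarify: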

# Package
    library(tidyverse)
# Create fake data
    df1 <- data.frame(place = c("A", "B", "C"),
                 avg = c(3.4, 4.5, 1.8))

# Plot, but it's not in order of value
    ggplot(df1, aes(x = place, y = avg)) + 
      geom_point(size = 4)

# Now put it in order
    df1$place <- factor(df1$place, levels = df1$place[order(df1$avg)])

# Plots in order now
    ggplot(df1, aes(x = place, y = avg)) + 
      geom_point(size = 4)

# Adding second, conditional variable (called: new)
    df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                 new = rep(0:1, 3),
                 avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))

    ggplot(df2, aes(x = place, y = avg, col = factor(new))) +
      geom_point(size = 3)

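Goal: I would like to order and plot the factor variable place by the difference in avg between when new is 1 and when new is 0.

3 answers:

Answer 0: (score: 1)

You can create the levels for the place column with: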

library(tidyr)
df2$place <- factor(df2$place, levels=with(spread(df2, new, avg), place[order(`1` - `0`)]))

ggplot(df2, aes(x = place, y = avg, col = factor(new))) +
    geom_point(size = 3) + labs(color = 'new')

Which gives the plot with place ordered by that difference (image not preserved).
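For readers without tidyr, the same ordering can be sketched in base R. This is only an illustration of the idea in this answer, not the answerer's code; the helper names avg0, avg1 and diffs are made up for the example.

```r
# Same fake data as in the question
df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                  new = rep(0:1, 3),
                  avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))

# Named vectors of avg for each value of new, keyed by place
# (avg0, avg1 and diffs are illustrative names, not from the thread)
avg0 <- with(subset(df2, new == 0), setNames(avg, place))
avg1 <- with(subset(df2, new == 1), setNames(avg, place))

# Difference per place: avg at new == 1 minus avg at new == 0
diffs <- avg1 - avg0[names(avg1)]

# Use the places sorted by that difference as the factor levels
df2$place <- factor(df2$place, levels = names(sort(diffs)))
levels(df2$place)
```

The reordered df2 can then go into the same ggplot call as above. Note that B and C both differ by about -0.3, so their relative position may come down to floating-point rounding; only A, with a difference of about -1.1, is clearly separated.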

Answer 1: (score: 1)

If I understand the goal correctly, factor A has the largest difference:

avg(new = 0) - avg(new = 1) = 1.1

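So you can spread the data frame to calculate the differences, then gather it back, then plot avg against place, reordered by diff. Or, if you want A first, reorder by -diff.

(But let me know if I have not understood correctly.)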

df2 %>% 
  spread(new, avg) %>% 
  mutate(diff = `0` - `1`) %>% 
  gather(new, avg, -diff, -place) %>% 
  ggplot(aes(reorder(place, diff), avg)) + 
    geom_point(aes(color =factor(new)), size = 3)

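To see what the choice between diff and -diff does to the axis order, here is a small base R sketch. It is an assumption-laden illustration rather than part of the answer: the data frame d and the suffixes "_0" and "_1" are invented for the example, and merge() stands in for the spread step.

```r
# Same fake data as in the question
df2 <- data.frame(place = c("A", "A", "B", "B", "C", "C"),
                  new = rep(0:1, 3),
                  avg = c(3.4, 2.3, 4.5, 4.2, 2.1, 1.8))

# One row per place, with avg at new == 0 and new == 1 side by side
# (d and the "_0"/"_1" suffixes are illustrative names)
d <- merge(subset(df2, new == 0), subset(df2, new == 1),
           by = "place", suffixes = c("_0", "_1"))
d$diff <- d$avg_0 - d$avg_1

# reorder() sorts the levels by increasing score
asc  <- reorder(d$place, d$diff)    # largest difference (A) ends up last
desc <- reorder(d$place, -d$diff)   # largest difference (A) comes first

levels(asc)
levels(desc)
```

In the pipeline above, swapping reorder(place, diff) for reorder(place, -diff) inside aes() should have the same effect on the x axis.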

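Answer 2: (score: 0)

First compute the column with dplyr:

# Within each place, diff(avg) is avg(new = 1) - avg(new = 0), given the row order in df2
df2 <- df2 %>% group_by(place) %>% mutate(diff = diff(avg)) %>% ungroup()

ggplot(df2, aes(x = place, y = diff, color = diff)) +
  geom_point(size = 3)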