ggplot2: x-axis ticks and labels incompatible with regression line (geom_smooth and geom_abline)

Time: 2019-02-28 12:22:41

Tags: r ggplot2 regression axis

I want to plot yval against xval for the following dataset, with a regression line and with the x-axis labels taken from xlab:

# Reproduce data
df <- data.frame(xlab = c("C","W","I","Y","F","L","H","V","N","M","R","T","D","G","A","K","Q","S","E","P", "B", NA, "U","Z","X"),
                 xval = c(0.000, 0.004, 0.090, 0.113, 0.117, 0.195, 0.259, 0.263, 0.285, 0.291, 0.394, 0.401, 0.407, 0.437, 0.450, 0.588, 0.665, 0.713, 0.781, 1.000,NA,NA,NA,NA,NA),
                 yval = c(376744, 143848, 796132, 401820, 500313, 1373674,  383024,  981537,  831832,  295145,  910981, 1001490,  910590, 1999530, 1474808, 1001585,  860611, 1510439, 1225631, 1290983, 21, NA, 24, 48, 1034))

Details:

> str(df)
'data.frame':   25 obs. of  3 variables:
 $ xlab: Factor w/ 24 levels "A","B","C","D",..: 3 21 9 23 6 11 8 20 13 12 ...
 $ xval: num  0 0.004 0.09 0.113 0.117 0.195 0.259 0.263 0.285 0.291 ...
 $ yval: num  376744 143848 796132 401820 500313 ...

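I first tried geom_smooth() in two different ways. In the first the labels are satisfactory but the regression line is not; in the second it is exactly the other way round: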

1. Labels good, geom_smooth bad

p1 <- ggplot(df)+
  geom_point(aes(x = as.factor(xval), y = yval))+
  scale_x_discrete(labels = as.character(df$xlab),
                   breaks = df$xval)+
  stat_smooth(method = "lm", 
              data = df, aes(x = xval, y = yval))+ 
  labs(x ="my x-axis title",
       y = "my y-axis title")+
  theme_minimal()+
  theme(axis.text.x = element_text(angle = 0, hjust = 1.1))
p1


2. Labels bad, geom_smooth good

p2 <- ggplot(df, aes(x = xval, y = yval))+
  geom_point()+
  scale_x_discrete(labels = as.character(df$xlab),
                   breaks = df$xval)+
  stat_smooth(method = "lm")+ 
  labs(x ="my x-axis title",
       y = "my y-axis title")+
  theme_minimal()+
  theme(axis.text.x = element_text(angle = 0, hjust = 1.1))
p2


I also tried geom_abline with a model I fitted myself. However, this gives unequal spacing between the points on the x-axis.

3. Labels good, geom_abline good, x-axis spacing bad

# other approach with abline
model.lm.tr <- lm(yval ~ xval, data = df)  # use the column names of the reproduced df
p3 <- ggplot(df, aes(x = xval, y = yval))+
  geom_point()+
  scale_x_continuous(labels = as.character(df$xlab),
                     breaks = df$xval)+
  geom_abline(intercept = coefficients(model.lm.tr)[1], slope = coefficients(model.lm.tr)[2])+
  labs(x ="my x-axis title",
       y = "my y-axis title")+
  theme_minimal()+
  theme(axis.text.x = element_text(angle = 0, hjust = 1.1, vjust = 0.5))
p3


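Does anyone have a suggestion for how I can get the x-axis spacing and labels of 1 together with the regression line of 2 (the line from 3 would also be fine)?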

1 Answer:

Answer 0 (score: 0)

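The trick to using stat_smooth() with a discrete axis is to redefine the grouping for the stat layer. By default, ggplot2 assumes that statistics should be computed within each discrete group (as defined by the discrete axis). To override that behaviour, you can set aes(group = 1, ...), i.e. define a single dummy group that contains all of the plotted data.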

ggplot(df, aes(x = as.factor(xval), y = yval))+
  geom_point() +
  scale_x_discrete(labels = as.character(df$xlab),
                   breaks = df$xval)+
  stat_smooth(method = "lm", aes(group = 1))


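To keep the example minimal, I dropped the labs and theme adjustments and moved the main aes() definition to the initial ggplot() call to avoid redundant typing.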
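The same grouping trick can be sketched in a slightly different form that avoids matching labels to breaks by hand. This is only an illustrative variant, not the exact code from the answer: it assumes the rows with missing xval can be dropped, and it builds an ordered factor from xlab so that the axis labels come straight from the factor levels. The names df_ok and xlab_ord are made up for this sketch.

```r
library(ggplot2)

# Same data as in the question
df <- data.frame(
  xlab = c("C","W","I","Y","F","L","H","V","N","M","R","T","D","G","A",
           "K","Q","S","E","P","B",NA,"U","Z","X"),
  xval = c(0.000, 0.004, 0.090, 0.113, 0.117, 0.195, 0.259, 0.263, 0.285,
           0.291, 0.394, 0.401, 0.407, 0.437, 0.450, 0.588, 0.665, 0.713,
           0.781, 1.000, NA, NA, NA, NA, NA),
  yval = c(376744, 143848, 796132, 401820, 500313, 1373674, 383024, 981537,
           831832, 295145, 910981, 1001490, 910590, 1999530, 1474808,
           1001585, 860611, 1510439, 1225631, 1290983, 21, NA, 24, 48, 1034)
)

# Assumption: rows without an xval (or yval) are not wanted in the plot
df_ok <- df[!is.na(df$xval) & !is.na(df$yval), ]

# Hypothetical helper column: xlab as a factor whose levels follow xval,
# so the discrete axis is equally spaced and ordered by xval
df_ok$xlab_ord <- factor(as.character(df_ok$xlab),
                         levels = as.character(df_ok$xlab)[order(df_ok$xval)])

p <- ggplot(df_ok, aes(x = xlab_ord, y = yval)) +
  geom_point() +
  # group = 1 makes the smoother use all points as one group
  stat_smooth(method = "lm", formula = y ~ x, aes(group = 1)) +
  labs(x = "my x-axis title", y = "my y-axis title") +
  theme_minimal()
p
```

Because the labels are the factor levels themselves, no labels or breaks arguments are needed in this variant.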

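Disclaimer: just because you can do this does not mean it is the best way to visualise the patterns in your data. Think carefully about the conclusions you draw.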
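One way to see why caution is warranted: on a discrete axis the levels sit at equally spaced integer positions, so a line fitted with group = 1 is a regression of yval on those positions rather than on the numeric xval. The base R sketch below uses only the 20 complete rows from the question and compares the two fits; pos, fit_value and fit_pos are names introduced here for illustration, and this is my reading of what happens rather than something stated in the answer.

```r
# The 20 complete rows from the question
df <- data.frame(
  xval = c(0.000, 0.004, 0.090, 0.113, 0.117, 0.195, 0.259, 0.263, 0.285,
           0.291, 0.394, 0.401, 0.407, 0.437, 0.450, 0.588, 0.665, 0.713,
           0.781, 1.000),
  yval = c(376744, 143848, 796132, 401820, 500313, 1373674, 383024, 981537,
           831832, 295145, 910981, 1001490, 910590, 1999530, 1474808,
           1001585, 860611, 1510439, 1225631, 1290983)
)

# Position of each point on a discrete axis built from as.factor(xval):
# the factor levels are sorted numerically, so this is 1, 2, ..., 20
df$pos <- as.integer(as.factor(df$xval))

# Regression on the actual x values (continuous axis, as in attempts 2 and 3)
fit_value <- lm(yval ~ xval, data = df)

# Regression on the axis positions (discrete axis with group = 1)
fit_pos <- lm(yval ~ pos, data = df)

# Compare slopes and fit quality; the two models are not the same
coef(fit_value)
coef(fit_pos)
summary(fit_value)$r.squared
summary(fit_pos)$r.squared
```

If the two fits tell a noticeably different story, the equally spaced plot may be misleading about the relationship between xval and yval.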