Adding arrows with a slope and error to a geom_point plot

Time: 2018-12-05 07:45:29

Tags: r ggplot2 polygon arrows

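I have xy data from two groups, where each point also has corresponding xend, yend coordinates that indicate where an arrow starting at that point ends: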

set.seed(1)
df <- data.frame(x=c(rnorm(50,-1,0.5),rnorm(50,1,0.5)),y=c(rnorm(50,-1,0.5),rnorm(50,1,0.5)),group=c(rep("A",50),rep("B",50)))
df$arrow.x.end <- c(df$x[1:50]+runif(50,0,0.25),df$x[51:100]-runif(50,0,0.25))
df$arrow.y.end <- c(df$y[1:50]+runif(50,0,0.25),df$y[51:100]-runif(50,0,0.25))

The arrows of group A generally point towards group B, and vice versa:

library(ggplot2)
ggplot(df,aes(x=x,y=y,color=group))+geom_point()+theme_minimal()+
  geom_segment(aes(x=x,y=y,xend=arrow.x.end,yend=arrow.y.end),arrow=arrow())+
  theme(legend.position="none")


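I am looking for a way to plot the points with only two arrows, one per group. Each arrow would start at its group's centroid and have a slope equal to the median slope of that group. Ideally, each arrow would also show the standard error of its group's median slope as a polygon.

Here is what I have so far: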

library(dplyr)
slope.df <- df %>%
  dplyr::group_by(group) %>%
  # per-arrow slope (signed by the y direction) and length
  dplyr::mutate(slope=(arrow.y.end-y)/abs((arrow.x.end-x)),length=sqrt((arrow.y.end-y)^2+(arrow.x.end-x)^2)) %>%
  # per-group median slope, its approximate standard error, median length and start point
  dplyr::summarise(slope.median=median(slope),
                   slope.median.se=1.2533*(sd(slope)/sqrt(n())),
                   median.length=median(length),
                   x.start=median(x),y.start=median(y)) %>%
  dplyr::mutate(x.end=x.start+sign(slope.median)*(median.length/sqrt(2))) %>%
  # offset y.end from y.start so the arrow begins at the group centre
  dplyr::mutate(y.end=y.start+sign(slope.median)*((x.end-x.start)*slope.median))

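This computes the slope and the length of each arrow. Then, for each group, it computes the median slope, the standard error of the median slope, and the median length. For now I compute the xend and yend of the median arrow from: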

median.length^2 <- xend^2 + xend^2 

but I am open to other approaches.
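For comparison, here is a minimal sketch of an endpoint calculation that keeps the arrow length equal to a requested length for any slope. The helper name `arrow_offsets` and its `direction` argument are hypothetical and are not part of the code above. It takes the slope as an ordinary dy/dx and uses the relation dx^2 + (slope*dx)^2 = length^2.

```r
# Hypothetical helper: offsets (dx, dy) of an arrow with a given length and slope.
# `direction` is +1 for an arrow pointing towards larger x and -1 otherwise.
arrow_offsets <- function(length, slope, direction = 1) {
  # solve dx^2 + (slope * dx)^2 = length^2 for dx
  dx <- direction * length / sqrt(1 + slope^2)
  dy <- slope * dx
  c(dx = dx, dy = dy)
}

off <- arrow_offsets(length = 0.2, slope = 1)
sqrt(sum(off^2))  # Euclidean length of the offset vector
```

With a slope of 1 this reduces to the length/sqrt(2) offset used in the code above; for other slopes the divisor changes so that the drawn arrow keeps the requested length.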

I then plot this:

ggplot(df,aes(x=x,y=y,color=group))+geom_point()+theme_minimal()+theme(legend.position="none")+
  geom_segment(aes(x=x.start,y=y.start,xend=x.end,yend=y.end),arrow=arrow(),data=slope.df)

This draws one arrow per group on top of the points (image not included).

Any suggestions on whether there is a better approach, and on how to add the standard error polygon?

1 answer:

Answer 0: (score: 1)

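Compute the mean of x and y for each group and period (period 0 holds the start points, period 1 the arrow end points)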

# stack the start points (period 0) and the arrow end points (period 1)
df2 <- df %>% 
  select( -arrow.x.end, -arrow.y.end ) %>%
  mutate( period = 0 ) %>%
  rbind( data.frame( x = df$arrow.x.end,
                     y = df$arrow.y.end,
                     group = df$group,
                     period = 1) 
         ) %>%
  # mean x and y per group and period
  group_by( group, period ) %>%
  summarise_all( mean )

# # A tibble: 4 x 4
# # Groups:   group [2]
#   group period      x      y
#   <fct>  <dbl>  <dbl>  <dbl>
# 1 A          0 -0.950 -1.08 
# 2 A          1 -0.816 -0.942
# 3 B          0  1.06   1.04 
# 4 B          1  0.940  0.905

Plot, using stat_smooth to draw a line through the "means" of the clouds

ggplot( data = df2, aes( x = x, y = y, colour = group ) ) + 
  stat_smooth(se = TRUE, method = lm, fullrange = TRUE) +
  geom_point( data = df, aes(x = x, y = y, colour = group, fill = group ) ) + 
  geom_point( data = df, aes(x = arrow.x.end, y = arrow.y.end, colour = group, fill = group), alpha = 0.5 )

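As a complement, here is a minimal sketch of one way to draw the standard error as a polygon, which the question asks about. It is not the answerer's method. It works with arrow angles (via atan2) instead of slopes, and it assumes that the angles within a group do not straddle the +/-pi boundary. The names `arrow.df` and `wedge.df` are made up for this illustration. The 1.2533 factor is the approximation from the question for the standard error of a median.

```r
library(dplyr)
library(ggplot2)

# Same simulation recipe as in the question.
set.seed(1)
df <- data.frame(x = c(rnorm(50, -1, 0.5), rnorm(50, 1, 0.5)),
                 y = c(rnorm(50, -1, 0.5), rnorm(50, 1, 0.5)),
                 group = rep(c("A", "B"), each = 50))
df$arrow.x.end <- c(df$x[1:50] + runif(50, 0, 0.25), df$x[51:100] - runif(50, 0, 0.25))
df$arrow.y.end <- c(df$y[1:50] + runif(50, 0, 0.25), df$y[51:100] - runif(50, 0, 0.25))

# Per-group median angle, its approximate standard error, and median length.
arrow.df <- df %>%
  mutate(angle = atan2(arrow.y.end - y, arrow.x.end - x),
         len = sqrt((arrow.x.end - x)^2 + (arrow.y.end - y)^2)) %>%
  group_by(group) %>%
  summarise(angle.med = median(angle),
            angle.se = 1.2533 * sd(angle) / sqrt(n()),
            len.med = median(len),
            x.start = median(x), y.start = median(y)) %>%
  mutate(x.end = x.start + len.med * cos(angle.med),
         y.end = y.start + len.med * sin(angle.med))

# One wedge per group: the start point plus an arc spanning median angle +/- one SE.
wedge.df <- do.call(rbind, lapply(seq_len(nrow(arrow.df)), function(i) {
  r <- arrow.df[i, ]
  a <- seq(r$angle.med - r$angle.se, r$angle.med + r$angle.se, length.out = 20)
  data.frame(group = r$group,
             x = c(r$x.start, r$x.start + r$len.med * cos(a)),
             y = c(r$y.start, r$y.start + r$len.med * sin(a)))
}))

p <- ggplot(df, aes(x = x, y = y, color = group)) +
  geom_point() +
  geom_polygon(aes(x = x, y = y, fill = group), data = wedge.df,
               alpha = 0.3, color = NA) +
  geom_segment(aes(x = x.start, y = y.start, xend = x.end, yend = y.end),
               data = arrow.df, arrow = arrow()) +
  theme_minimal() + theme(legend.position = "none")
p
```

Because the median arrow is short relative to the spread of the points, the wedge will be small on this data; scaling `len.med` by a constant before building both data frames is one way to make it more visible.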