I have xy data from two groups, where each point also has corresponding xend and yend coordinates that indicate where an arrow starting at that point ends:
set.seed(1)
df <- data.frame(x=c(rnorm(50,-1,0.5),rnorm(50,1,0.5)),y=c(rnorm(50,-1,0.5),rnorm(50,1,0.5)),group=c(rep("A",50),rep("B",50)))
df$arrow.x.end <- c(df$x[1:50]+runif(50,0,0.25),df$x[51:100]-runif(50,0,0.25))
df$arrow.y.end <- c(df$y[1:50]+runif(50,0,0.25),df$y[51:100]-runif(50,0,0.25))
The arrows of group A generally point toward group B, and vice versa:
library(ggplot2)
ggplot(df,aes(x=x,y=y,color=group))+geom_point()+theme_minimal()+
geom_segment(aes(x=x,y=y,xend=arrow.x.end,yend=arrow.y.end),arrow=arrow())+
theme(legend.position="none")
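I'm looking for a way to plot the points with only two arrows, one per group. Each arrow would start at its group's centroid and have a slope equal to the median slope of that group. Ideally, each arrow would also come with a polygon showing the standard error of the group's median slope.
Here is what I have so far: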
library(dplyr)
slope.df <- df %>%
dplyr::group_by(group) %>%
dplyr::mutate(slope=(arrow.y.end-y)/abs((arrow.x.end-x)),length=sqrt((arrow.y.end-y)^2+(arrow.x.end-x)^2)) %>%
dplyr::summarise(slope.median=median(slope),
slope.median.se=1.2533*(sd(slope)/sqrt(n())),
median.length=median(length),
x.start=median(x),y.start=median(y)) %>%
dplyr::mutate(x.end=x.start+sign(slope.median)*(median.length/sqrt(2))) %>%
dplyr::mutate(y.end=sign(slope.median)*((x.end-x.start)*slope.median))
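This computes the slope and the length of each arrow. Then, for each group, it computes the median slope, the standard error of the median slope, and the median length. For now, I compute the xend and yend of the median arrow from: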
median.length^2 <- xend^2 + xend^2
but I'm open to other approaches.
Plotting the result:
ggplot(df,aes(x=x,y=y,color=group))+geom_point()+theme_minimal()+theme(legend.position="none")+
geom_segment(aes(x=x.start,y=y.start,xend=x.end,yend=y.end),arrow=arrow(),data=slope.df)
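A note on the geometry: the relation median.length^2 = xend^2 + xend^2 used above only holds when the arrow sits at 45 degrees, because it assumes equal x and y offsets. For a general slope m and length L, the offsets satisfy dx^2 + (m*dx)^2 = L^2, so dx = L/sqrt(1 + m^2) and dy = m*dx. The following is a minimal sketch of that calculation; `arrow_end` and its `direction` argument are hypothetical names introduced here for illustration, not part of the code above.

```r
# Hypothetical helper: endpoint of an arrow given its start, slope and length.
# `direction` is +1 to point toward increasing x, -1 toward decreasing x.
arrow_end <- function(x0, y0, slope, len, direction = 1) {
  dx <- direction * len / sqrt(1 + slope^2)  # horizontal offset
  dy <- slope * dx                           # vertical offset follows the slope
  data.frame(xend = x0 + dx, yend = y0 + dy)
}

# A slope of 1 and a length of sqrt(2) give offsets of exactly 1 in x and y.
e <- arrow_end(x0 = -1, y0 = -1, slope = 1, len = sqrt(2))
```

Unlike the offsets in slope.df, the offsets here are added to both the x and the y start coordinates, so the arrow begins at the centroid.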
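Any suggestions on whether there is a better way, and on how to add the standard-error polygon?
Answer 0 (score: 1)
Compute the mean of x and y for each period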
df2 <- df %>%
select( -c(4,5) ) %>%
mutate( period = 0 ) %>%
rbind( data.frame( x = df$arrow.x.end,
y = df$arrow.y.end,
group = c( rep( "A", 50 ),rep( "B" , 50 ) ),
period = 1)
) %>%
group_by( group, period ) %>%
summarise_all( mean )
# # A tibble: 4 x 4
# # Groups: group [2]
# group period x y
# <fct> <dbl> <dbl> <dbl>
# 1 A 0 -0.950 -1.08
# 2 A 1 -0.816 -0.942
# 3 B 0 1.06 1.04
# 4 B 1 0.940 0.905
Plot, using stat_smooth to draw a line through the "means" of the clouds
ggplot( data = df2, aes( x = x, y = y, colour = group ) ) +
stat_smooth(se = TRUE, method = lm, fullrange = TRUE) +
geom_point( data = df, aes(x = x, y = y, colour = group, fill = group ) ) +
geom_point( data = df, aes(x = arrow.x.end, y = arrow.y.end, colour = group, fill = group), alpha = 0.5 )
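If an explicit arrow and an error polygon per group are wanted, one possible sketch is to average the displacement vectors directly and draw a wedge spanning the mean direction plus or minus one standard error of the arrow angles. This uses means rather than the medians asked for in the question, and the names `summ`, `wedge` and `angle.se` are introduced here only for illustration. It also assumes the angles within a group do not straddle the ±pi boundary, which holds for this simulated data but would need circular statistics in general.

```r
set.seed(1)
df <- data.frame(x = c(rnorm(50, -1, 0.5), rnorm(50, 1, 0.5)),
                 y = c(rnorm(50, -1, 0.5), rnorm(50, 1, 0.5)),
                 group = c(rep("A", 50), rep("B", 50)))
df$arrow.x.end <- c(df$x[1:50] + runif(50, 0, 0.25), df$x[51:100] - runif(50, 0, 0.25))
df$arrow.y.end <- c(df$y[1:50] + runif(50, 0, 0.25), df$y[51:100] - runif(50, 0, 0.25))

# Per-arrow displacement and direction angle. atan2 distinguishes all four
# quadrants, whereas a slope cannot tell "up-right" from "down-left".
df$dx <- df$arrow.x.end - df$x
df$dy <- df$arrow.y.end - df$y
df$angle <- atan2(df$dy, df$dx)

# Per-group centroid, mean displacement, and standard error of the angle.
summ <- do.call(rbind, lapply(split(df, df$group), function(g) {
  data.frame(group = g$group[1],
             x.start = mean(g$x), y.start = mean(g$y),
             dx = mean(g$dx), dy = mean(g$dy),
             angle.se = sd(g$angle) / sqrt(nrow(g)))
}))
summ$x.end <- summ$x.start + summ$dx
summ$y.end <- summ$y.start + summ$dy

# Wedge polygon per group: the centroid plus two points at (mean angle -/+ SE).
mean.angle <- atan2(summ$dy, summ$dx)
len <- sqrt(summ$dx^2 + summ$dy^2)
wedge <- do.call(rbind, lapply(seq_len(nrow(summ)), function(i) {
  a <- mean.angle[i] + c(-1, 1) * summ$angle.se[i]
  data.frame(group = summ$group[i],
             x = summ$x.start[i] + c(0, len[i] * cos(a)),
             y = summ$y.start[i] + c(0, len[i] * sin(a)))
}))

# Plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = x, y = y, colour = group)) +
    geom_point() +
    geom_polygon(data = wedge, aes(x = x, y = y, fill = group),
                 alpha = 0.3, colour = NA) +
    geom_segment(data = summ,
                 aes(x = x.start, y = y.start, xend = x.end, yend = y.end),
                 arrow = arrow(length = unit(0.1, "inches")), colour = "black") +
    theme_minimal() + theme(legend.position = "none")
}
```

The mean arrows are as short as the individual arrows, so multiplying `dx`, `dy` and `len` by a common factor may make them easier to see. That scaling changes only the drawn length, not the direction or the angular width of the wedge.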