This is my dataset:
ID A B Y Time
1 1 0 1 1
1 1 0 4 2
...
1 1 0 7 10
2 1 1 3 1
...
where A and B are binary (they do not vary within an ID), Y is continuous, and Time runs from 1 to 10 for each ID.
I am trying to draw four lines in the same plot, with Time on the x-axis:
Y when A = 0 and B = 0, Y when A = 0 and B = 1, Y when A = 1 and B = 0, and Y when A = 1 and B = 1.
I computed the mean of Y for A = 0, B = 0, T = 1, then for A = 0, B = 0, T = 2, and so on, but that is not efficient.
What is the best way to draw the four lines?
Answer 0: (score: 1)
Here is one way to do it using aggregate and ggplot2:
set.seed(123)
df1 <- data.frame(ID = rep(c(1:5), each = 10),
                  A = rep(c(0,0,1,1,0), each = 10),
                  B = rep(c(0,1,0,1,1), each = 10),
                  Y = rnorm(50),
                  Time = rep(1:10, 5))
aggregate
df1_agg <- aggregate(Y ~ Time + A + B, data = df1, mean)
#add AB column
df1_agg$AB <- paste('A =', df1_agg$A, 'B =', df1_agg$B)
head(df1_agg) #what does it look like?
Time A B Y AB
1 1 0 0 -0.56047565 A = 0 B = 0
2 2 0 0 -0.23017749 A = 0 B = 0
3 3 0 0 1.55870831 A = 0 B = 0
4 4 0 0 0.07050839 A = 0 B = 0
5 5 0 0 0.12928774 A = 0 B = 0
6 6 0 0 1.71506499 A = 0 B = 0
ggplot2
library(ggplot2)
ggplot(data = df1_agg, aes(x = Time, y = Y, colour = AB)) +
  geom_line()
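As a possible alternative, the averaging can be left to ggplot2 itself, which avoids building the aggregated data frame by hand. The sketch below reuses the simulated df1 from above and assumes ggplot2 3.3.0 or later, where the summary function argument of stat_summary is named fun (older versions called it fun.y). The object name p is only illustrative.

```r
library(ggplot2)

# Same simulated data as above
set.seed(123)
df1 <- data.frame(ID = rep(1:5, each = 10),
                  A = rep(c(0, 0, 1, 1, 0), each = 10),
                  B = rep(c(0, 1, 0, 1, 1), each = 10),
                  Y = rnorm(50),
                  Time = rep(1:10, 5))

# Label each A/B combination directly on the raw data
df1$AB <- paste('A =', df1$A, 'B =', df1$B)

# stat_summary computes mean(Y) at each Time within each AB group
# (assumes ggplot2 >= 3.3.0; use fun.y = mean on older versions)
p <- ggplot(df1, aes(x = Time, y = Y, colour = AB)) +
  stat_summary(fun = mean, geom = "line")
p
```

This should draw the same four lines as the aggregate version. The trade-off is that the group means are not kept in a data frame, so the aggregate approach is still the better fit if the means are needed for anything besides plotting.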