Question

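I have the following data with paired observations in long format. I am trying to run a paired t-test in R along the time variable while keeping the data in long format, but I first need to detect the obs that are not available at both time 1 and time 2 (in this case obs B and E), and then possibly create a new data frame with the observations in order. Is there a way to do this without first reshaping the data to wide format? Thanks for any help and suggestions, R newbie here.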

obs time value
A   1    5.5
B   1    7.1
C   1    4.3
D   1    6.4
E   1    6.6
F   1    5.6
G   1    6.6
A   2    6.5
C   2    6.7
D   2    7.8
F   2    5.7
G   2    8.9

Answer 1

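As an alternative to using duplicated on the long-format data as in @CPak's answer, you can group by obs and keep only the groups whose observation count is not equal to 1: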

library(dplyr)

p <- df %>%
  group_by(obs) %>%
  filter(n() != 1) %>%      # drop obs that occur only once
  arrange(time, obs) %>%    # order by time, then obs, so the pairs line up
  ungroup()
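
To run the snippet above, df has to exist first. Below is a minimal sketch, assuming df is read from the table in the question with read.table (the question does not show how df was created). It also uses a slightly stricter filter, n_distinct(time) == 2, which is a variation of mine rather than part of the answer: it keeps only obs seen at both time points, whereas n() != 1 would also keep an obs recorded twice at the same time.

```r
library(dplyr)

# Assumption: df is rebuilt here from the table in the question
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
obs time value
A   1    5.5
B   1    7.1
C   1    4.3
D   1    6.4
E   1    6.6
F   1    5.6
G   1    6.6
A   2    6.5
C   2    6.7
D   2    7.8
F   2    5.7
G   2    8.9
")

# Keep only obs that appear at both time points, then order by time and obs
p <- df %>%
  group_by(obs) %>%
  filter(n_distinct(time) == 2) %>%
  ungroup() %>%
  arrange(time, obs)

# Obs that were dropped because they have no pair
dropped <- setdiff(df$obs, p$obs)
dropped
# "B" "E"
```

For the data in the question both filters give the same 10 rows; they only differ if an obs is repeated within one time point.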

Either way you end up with the same result, and you can then apply the t-test as shown in @CPak's answer:

ans <- with(p, t.test(value ~ time, paired=TRUE))

> ans

    Paired t-test

data:  value by time
t = -3.3699, df = 4, p-value = 0.02805
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.6264228 -0.2535772
sample estimates:
mean of the differences 
                  -1.44
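
One caveat: to the best of my knowledge, from R 4.4.0 onward the formula method of t.test no longer accepts paired=TRUE and stops with an error. Below is a minimal sketch of an alternative that uses the default (x, y) interface instead. It assumes p is the filtered data frame from this answer (rebuilt by hand here so the block runs on its own) and that it is sorted by time and then obs, so the two vectors line up pair by pair.

```r
# Assumption: p is the filtered data from the answer, rebuilt by hand here
p <- data.frame(
  obs   = rep(c("A", "C", "D", "F", "G"), times = 2),
  time  = rep(1:2, each = 5),
  value = c(5.5, 4.3, 6.4, 5.6, 6.6, 6.5, 6.7, 7.8, 5.7, 8.9)
)

# Sort by time, then obs, so that row i of each half is the same obs
p <- p[order(p$time, p$obs), ]
x <- p$value[p$time == 1]
y <- p$value[p$time == 2]

# Guard: both vectors must refer to the same obs in the same order
stopifnot(identical(p$obs[p$time == 1], p$obs[p$time == 2]))

# Paired test on the differences x - y (time 1 minus time 2)
ans <- t.test(x, y, paired = TRUE)
ans
```

The explicit stopifnot check is there because a paired test on two vectors silently pairs by position; if the rows were not sorted by obs within each time point, the test would still run but compare the wrong pairs.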

Answer 2

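You can filter your data using duplicated in both the forward direction (the default, fromLast=FALSE) and the backward direction (fromLast=TRUE), and then perform the paired t.test: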

library(dplyr)
p <- df %>%
       filter(duplicated(obs) | duplicated(obs, fromLast=TRUE)) %>%
       arrange(time, obs)

#    obs time value
# 1    A    1   5.5
# 2    C    1   4.3
# 3    D    1   6.4
# 4    F    1   5.6
# 5    G    1   6.6
# 6    A    2   6.5
# 7    C    2   6.7
# 8    D    2   7.8
# 9    F    2   5.7
# 10   G    2   8.9
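
The same filter can be sketched in base R without dplyr. This is only an illustration, assuming df holds the data from the question (rebuilt here with data.frame; the names dup_fwd and dup_bwd are mine). duplicated(obs) flags the second and later occurrences of an obs, while fromLast=TRUE flags the earlier ones, so their union marks every row whose obs occurs more than once.

```r
# Assumption: df is rebuilt here from the table in the question
df <- data.frame(
  obs   = c("A", "B", "C", "D", "E", "F", "G", "A", "C", "D", "F", "G"),
  time  = c(rep(1, 7), rep(2, 5)),
  value = c(5.5, 7.1, 4.3, 6.4, 6.6, 5.6, 6.6, 6.5, 6.7, 7.8, 5.7, 8.9)
)

dup_fwd <- duplicated(df$obs)                   # TRUE for the time-2 rows of A, C, D, F, G
dup_bwd <- duplicated(df$obs, fromLast = TRUE)  # TRUE for the time-1 rows of A, C, D, F, G

# Keep rows flagged in either direction, then order by time and obs
p <- df[dup_fwd | dup_bwd, ]
p <- p[order(p$time, p$obs), ]
rownames(p) <- NULL
p
```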

With your original data:

ans <- with(p, t.test(value ~ time, paired=TRUE))

#     Paired t-test
#
# data:  value by time
# t = -3.3699, df = 4, p-value = 0.02805
# alternative hypothesis: true difference in means is not equal to 0
# 95 percent confidence interval:
#  -2.6264228 -0.2535772
# sample estimates:
# mean of the differences 
#                   -1.44

R: paired t.test, removing observations without a pair (long format)

2 answers: