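I have a data frame, df, with 3 columns:
I want to run a Spearman correlation test between column 1 and column 2, but only within each group (that is, the correlation should be computed only between the column 1 and column 2 observations that belong to group A, and likewise for group B). So I am using these lines of code: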
cor.test(df$column_1, df$column_2, alternative = ("two.sided"),
subset(df, column_3==c("group_A")),
data = df, method = c("spearm"))
cor.test(df$column_1, df$column_2, alternative = ("two.sided"),
subset(df, column_3==c("group_B")),
data = df, method = c("spearm"))
The thing is, I get the same result from both tests, so I guess the subset function is not working, because if I subset beforehand, like this:
x <- subset(df, column_3==c("group_A"))
y <- subset(df, column_3==c("group_B"))
and then run cor.test on x and y separately, I get different results. Does anyone know what is going on?
Warning message:
"In cor.test.default(cor_itir$Nart, cor_itir$Medida, alternative = "two.sided", :cannot compute exact p-value with ties"
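Answer 0: (score: 2)
By using the df$... extractors, specifying data=, and calling subset() as a standalone function, you are overcomplicating things a bit. You can get the same result with the following: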
# here's some example data with different correlations between each group
df <- data.frame(column_1=1:10,column_2=c(1:5,6,4,3,11,9),column_3=rep(c("a","b"),each=5))
Then just specify your formula, your data=, and your subset= inline:
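# method = "spearman" is needed here; without it cor.test defaults to Pearson
cor.test(~ column_1 + column_2, alternative = "two.sided", method = "spearman", data = df, subset = (column_3 == "a"))
cor.test(~ column_1 + column_2, alternative = "two.sided", method = "spearman", data = df, subset = (column_3 == "b"))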
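As a rough check of why the call in the question returns the same result twice, the sketch below inspects the argument lists of the two cor.test methods. When x and y are passed as vectors, the default method is used, and it has no data or subset argument, so the full columns are correlated both times. The names default_args, formula_args, via_formula and by_hand are illustrative and not part of the original post.

```r
# Example data from the answer above
df <- data.frame(column_1 = 1:10,
                 column_2 = c(1:5, 6, 4, 3, 11, 9),
                 column_3 = rep(c("a", "b"), each = 5))

# Argument names of the vector (default) method and of the formula method
default_args <- names(formals(getS3method("cor.test", "default")))
formula_args <- names(formals(getS3method("cor.test", "formula")))

"subset" %in% default_args   # FALSE: the vector interface cannot subset
"subset" %in% formula_args   # TRUE: the formula interface can

# Formula interface with subset= versus subsetting by hand
via_formula <- cor.test(~ column_1 + column_2, data = df,
                        subset = (column_3 == "a"), method = "spearman")
by_hand <- with(df[df$column_3 == "a", ],
                cor.test(column_1, column_2, method = "spearman"))

# Both routes should give the same rho for group "a"
all.equal(unname(via_formula$estimate), unname(by_hand$estimate))
```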
Or do it all in one go with by:
by(df, df$column_3, FUN = function(x) cor.test(~ column_1 + column_2, data = x, method = "spearman"))
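If a compact per-group summary is preferred over the printed test objects, one possible variation (a sketch, not part of the original answer) is to split the data frame by group and collect the estimate and p-value from each test. The names tests and summary_tbl are hypothetical, and method = "spearman" is passed because the question asks for Spearman.

```r
# Example data from the answer above
df <- data.frame(column_1 = 1:10,
                 column_2 = c(1:5, 6, 4, 3, 11, 9),
                 column_3 = rep(c("a", "b"), each = 5))

# One Spearman test per level of column_3
tests <- lapply(split(df, df$column_3), function(g) {
  cor.test(~ column_1 + column_2, data = g, method = "spearman")
})

# Collect rho and the p-value into a small matrix:
# rows are "rho" and "p.value", columns are the group labels
summary_tbl <- sapply(tests, function(ht) {
  c(rho = unname(ht$estimate), p.value = ht$p.value)
})
print(summary_tbl)
```

Each element of tests is still a full htest object, so anything else the test reports (for example tests$a$statistic) remains available.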
Answer 1: (score: 0)
Use with and subset:
with(subset(df, column_3 == "group_A"),
     cor.test(column_1, column_2, alternative = "two.sided",
              method = "spearman"))
with(subset(df, column_3 == "group_B"),
     cor.test(column_1, column_2, alternative = "two.sided",
              method = "spearman"))
Edit
Adding data:
df <- data.frame(column_1=1:10,column_2=c(1:5,6,4,3,11,9),column_3=rep(c("group_A","group_B"),each=5))
> with(subset(df, column_3==c("group_A")),
+ cor.test(column_1, column_2, alternative = ("two.sided"),
+ method = c("spearman")))
Spearman's rank correlation rho
data: column_1 and column_2
S = 4.4409e-15, p-value = 0.01667
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
1
> with(subset(df, column_3==c("group_B")),
+ cor.test(column_1, column_2, alternative = ("two.sided"),
+ method = c("spearman")))
Spearman's rank correlation rho
data: column_1 and column_2
S = 10, p-value = 0.45
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.5
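The warning quoted in the question ("cannot compute exact p-value with ties") is a separate matter from the subsetting problem. It appears when the Spearman test is asked for an exact p-value but the data contain tied values. A minimal sketch, using made-up vectors x and y that are not from the original post, of how passing exact = FALSE requests the asymptotic p-value up front and so avoids the warning:

```r
# Hypothetical data with one tied value in y (not from the original post)
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 2, 3, 5, 4, 6)

# Default call: an exact p-value is attempted, ties are found, a warning is raised
warned_default <- tryCatch({
  cor.test(x, y, method = "spearman")
  FALSE
}, warning = function(w) TRUE)

# exact = FALSE asks for the asymptotic approximation directly
warned_approx <- tryCatch({
  cor.test(x, y, method = "spearman", exact = FALSE)
  FALSE
}, warning = function(w) TRUE)

res <- cor.test(x, y, method = "spearman", exact = FALSE)
res$estimate
res$p.value
```

The estimate of rho is the same either way; only the way the p-value is computed changes.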