Question

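I want to write code that applies a function to compute the Spearman rank correlation between combinations of columns in a dataset. I have the following dataset: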

library(openxlsx)
data <- read.xlsx("e:/LINGUISTICS/mydata.xlsx", 1)

A    B    C    D
go   see  get  eat
see  get  eat  go
get  go   go   get
eat  eat  see  see

The function cor(rank(x), rank(y), method = "spearman") only measures the correlation between two columns, for example between A and B:

cor(rank(data$A), rank(data$B), method = "spearman")

But I need to compute the correlations between all possible column combinations (AB, AC, AD, BC, BD, CD). I wrote the following function for this:

wert <- function(x, y) { cor(rank(x), rank(y), method = "spearman") }

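I don't know how to get all possible column combinations (AB, AC, AD, BC, BD, CD) into my function so that all results are produced automatically, because my real data has many more columns. I would like the output as a matrix of correlation scores, like the table below: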

    A     B     C     D
A   1     0.3   0.4   0.8
B   0.3   1     0.6   0.5
C   0.4   0.6   1     0.1
D   0.8   0.5   0.1   1

Can anyone help me?

Answer 1

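You don't need rank. cor already computes Spearman's rank correlation when you pass method = "spearman". If you want the correlations between all columns of a data.frame, just pass the data.frame to cor, i.e. cor(data, method = "spearman"). You should study help("cor").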
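As a minimal sketch of that one-call approach, the block below uses made-up numeric data (the data frame df and its values are assumptions for illustration, not the asker's data). It only relies on cor accepting a data frame of numeric columns.

```r
# Hypothetical numeric data; values are illustrative only.
df <- data.frame(
  A = c(1, 2, 3, 4),
  B = c(2, 3, 1, 4),
  C = c(3, 4, 1, 2),
  D = c(4, 1, 3, 2)
)

# One call returns the full symmetric matrix of Spearman correlations,
# with the column names as row and column labels.
rho <- cor(df, method = "spearman")
print(round(rho, 2))
```

Each off-diagonal entry should match what the pairwise call from the question gives, e.g. rho["A", "B"] against cor(rank(df$A), rank(df$B)).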

If you want to do this manually, use the combn function.
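One way the manual route with combn might look is sketched here, again on made-up numeric data (df, pairs, rho_pairs and rho are illustrative names, not part of the original answer). combn(names(df), 2) lists every unordered pair of columns once, and the results are then copied into both triangles of a matrix.

```r
# Hypothetical numeric data; values are illustrative only.
df <- data.frame(
  A = c(1, 2, 3, 4),
  B = c(2, 3, 1, 4),
  C = c(3, 4, 1, 2),
  D = c(4, 1, 3, 2)
)

# All unordered pairs of column names, one pair per matrix column:
# AB, AC, AD, BC, BD, CD.
pairs <- combn(names(df), 2)

# Spearman correlation for each pair.
rho_pairs <- apply(pairs, 2, function(p) {
  cor(df[[p[1]]], df[[p[2]]], method = "spearman")
})
names(rho_pairs) <- apply(pairs, 2, paste, collapse = "")

# Start from an identity matrix (each column correlates perfectly with
# itself) and fill in both triangles.
rho <- diag(ncol(df))
dimnames(rho) <- list(names(df), names(df))
for (k in seq_len(ncol(pairs))) {
  a <- pairs[1, k]
  b <- pairs[2, k]
  rho[a, b] <- rho[b, a] <- rho_pairs[k]
}

print(round(rho, 2))
```

This computes each pair only once, which matters little for four columns but halves the work for wide data.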

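PS: Your additional challenge is that you actually have factor variables. The rank of an unordered factor is a strange concept, but R simply uses the collation order here. Since cor correctly expects numeric input, you should first run data[] <- lapply(data, as.integer).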
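A sketch of that conversion on the question's four columns follows. One assumption is worth flagging: if the columns arrive as character vectors rather than factors, as.integer alone yields NA for words like "go", so this version goes through factor() first, which works in both cases. The name codes is illustrative.

```r
# The question's data, typed in directly as character columns.
data <- data.frame(
  A = c("go", "see", "get", "eat"),
  B = c("see", "get", "go", "eat"),
  C = c("get", "eat", "go", "see"),
  D = c("eat", "go", "get", "see"),
  stringsAsFactors = FALSE
)

# factor() sorts the distinct values of each column to form its levels
# (here eat, get, go, see), and as.integer() returns the level codes.
# Assumption: every column contains the same set of words, so the codes
# are comparable across columns.
codes <- data
codes[] <- lapply(data, function(col) as.integer(factor(col)))

rho <- cor(codes, method = "spearman")
print(round(rho, 2))
```

Keeping the integer codes in a separate data frame leaves the original words available for inspection.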

Answer 2

I think you just need to create a function (pairedcolumns) that applies your function (spearman) to every pair of columns in the data frame you give it.

# Apply a two-argument function (fun) to every pair of columns in a data frame (x).
pairedcolumns <- function(x, fun)
{
  n <- ncol(x)  # number of columns in the data frame

  foo <- matrix(0, n, n)
  for (i in seq_len(n))
  {
    for (j in seq_len(n))
    {
      foo[i, j] <- fun(x[, i], x[, j])
    }
  }
  colnames(foo) <- rownames(foo) <- colnames(x)
  return(foo)
}

# Pass your own two-argument function, e.g. wert from the question.
results <- pairedcolumns(yourdataframe[, 2:8], wert)
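As a quick sanity check, a pairwise approach of this kind should reproduce what cor returns for the whole data frame. The sketch below uses a compact stand-in, pairwise_matrix (a hypothetical name, not the answer's function), together with wert from the question and made-up numeric data.

```r
# wert as defined in the question.
wert <- function(x, y) {
  cor(rank(x), rank(y), method = "spearman")
}

# Hypothetical helper: apply fun to every ordered pair of columns and
# return a matrix labelled with the column names.
pairwise_matrix <- function(x, fun) {
  cols <- names(x)
  sapply(cols, function(j) {
    sapply(cols, function(i) fun(x[[i]], x[[j]]))
  })
}

# Hypothetical numeric data; values are illustrative only.
df <- data.frame(
  A = c(1, 2, 3, 4),
  B = c(2, 3, 1, 4),
  C = c(3, 4, 1, 2),
  D = c(4, 1, 3, 2)
)

m <- pairwise_matrix(df, wert)

# TRUE if the pairwise result agrees with the built-in matrix version.
print(isTRUE(all.equal(m, cor(df, method = "spearman"))))
```

The pairwise version is mainly useful when fun is something cor cannot compute directly; for plain Spearman correlations the built-in call is shorter.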

How do I use a function for the Spearman rank correlation coefficient in R?
