I have a table that looks like this:
Year Class Value1 Value2 Value3
2006 A 45 27 96
2007 A 74 45 26
2008 C 74 41 78
2009 D 56 65 45
2010 C 12 14 15
2011 A 25 85 50
2012 B 26 45 12
2013 C 15 23 29
2014 D 86 36 53
How can I find the correlation between Value1 and Value2, and between Value1 and Value3, across all rows?
I tried this for Value1 and Value2:
cor <- data[,list(correlation=cor(Value1,Value2)),by=list(Year, Class)]
But I get this error:
Error in `[.data.frame`(data, , list(correlation = cor(Value1, Value2)), :
unused argument (by = list(Year, Class))
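Answer 0 (score: 1)
Here is one approach that returns a list in which each element is the correlation matrix for a given value of Class. Assume the table in your question is a data frame named dat:
Adapted from this CrossValidated answer: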
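library(plyr)
# Correlation matrix of the numeric columns; columns 1 and 2 (Year, Class) are dropped
corrFunc <- function(dat) {
  return(data.frame(cor(dat[, -c(1, 2)])))
}
# Split dat by Class, apply corrFunc to each piece, and collect the results in a list
corr.list <- dlply(dat, .(Class), corrFunc)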
Here is what the output looks like:
$A
           Value1     Value2     Value3
Value1  1.0000000 -0.5920024 -0.4347386
Value2 -0.5920024  1.0000000 -0.4684250
Value3 -0.4347386 -0.4684250  1.0000000

$B
       Value1 Value2 Value3
Value1     NA     NA     NA
Value2     NA     NA     NA
Value3     NA     NA     NA

$C
          Value1    Value2    Value3
Value1 1.0000000 0.9580847 0.9855342
Value2 0.9580847 1.0000000 0.9927778
Value3 0.9855342 0.9927778 1.0000000

$D
       Value1 Value2 Value3
Value1      1     -1      1
Value2     -1      1     -1
Value3      1     -1      1

attr(,"split_type")
[1] "data.frame"
attr(,"split_labels")
  Class
1     A
2     B
3     C
4     D
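Two notes on the question itself. The "unused argument (by = ...)" error appears because by inside [ ] is data.table syntax, and the message shows that data is a plain data.frame. Also, grouping by both Year and Class would leave a single row per group, so every correlation would be NA, just like class B above. If the goal is the correlation over all rows, no grouping is needed at all. Below is a minimal base R sketch, assuming the table is rebuilt by hand as dat; the names cor_v1_v2, cor_v1_v3 and by_class are only illustrative and do not come from the original post.

```r
# Rebuild the question's table as a plain data frame (named dat, as in the answer).
dat <- data.frame(
  Year   = 2006:2014,
  Class  = c("A", "A", "C", "D", "C", "A", "B", "C", "D"),
  Value1 = c(45, 74, 74, 56, 12, 25, 26, 15, 86),
  Value2 = c(27, 45, 41, 65, 14, 85, 45, 23, 36),
  Value3 = c(96, 26, 78, 45, 15, 50, 12, 29, 53)
)

# Correlations over all nine rows, ignoring Class.
cor_v1_v2 <- cor(dat$Value1, dat$Value2)
cor_v1_v3 <- cor(dat$Value1, dat$Value3)

# The same two pairs computed separately for each Class, without plyr.
# The result is a matrix with one column per class.
by_class <- sapply(split(dat, dat$Class), function(g) {
  c(V1_V2 = cor(g$Value1, g$Value2),
    V1_V3 = cor(g$Value1, g$Value3))
})

print(cor_v1_v2)
print(cor_v1_v3)
print(by_class)
```

Class B has only one row, so its entries are NA, and class D has only two rows, so its correlations are exactly 1 or -1. With so few rows per class, the per-class figures should be read with caution.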