Correlating multiple variables across multiple similar data frames

Time: 2014-03-11 21:09:27

Tags: r dataframe correlation

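I am new to R and to programming itself, having used XL and Miner for data analysis so far, so please forgive me if the question seems too basic.

I have 4 data frames: farm1, farm2, farm3 and farm4.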

`farm1<-structure(list(a = c(-0.700315674269212, 0.174376310290089, -0.802953642024395, 
-0.282317708655969, 0.198528974423857, 0.836114237945342, 0.983599830924647, 
1.14907220855077, -0.471945076669, -0.947783585965569), b = c(-0.0456355425554554, 
-0.301284883241843, 0.460328270868957, -0.496976686442155, -0.0325366991757349, 
0.458486775369624, -0.597532470372807, -0.648309589555456, 2.14749512128352, 
0.245124871567864), c = c(28.4681916252671, 31.5059762466411, 
36.5396753644422, 32.0019564063665, 33.6858689252592, 30.3833642979702, 
31.7212812595004, 33.2019595830279, 33.0727170129226, 31.4977963355712
), d = c(68.8195032459844, 68.3337594834099, 67.4836963601874, 
60.2779662871057, 67.0529412957513, 62.0801084450559, 63.0332790311212, 
57.9849455014888, 61.9213678477396, 51.4985302058811), e = c(5L, 
8L, 8L, 8L, 8L, 7L, 6L, 6L, 8L, 8L), f = c(17L, 12L, 12L, 13L, 
14L, 10L, 13L, 11L, 12L, 13L)), .Names = c("a", "b", "c", "d", 
"e", "f"), row.names = c(NA, -10L), class = "data.frame")`

`farm2<-structure(list(a = c(-0.164523596253587, -0.253361680136508, 
0.696963375404737, 0.556663198673657, -0.68875569454952, -0.70749515696212, 
0.36458196213683, 0.768532924515416, -0.112346212150228, 0.881107726454215
), b = c(-0.568668732818502, -0.135178615123832, 1.1780869965732, 
-1.52356680042976, 0.593946187628422, 0.332950371213518, 1.06309983727636, 
-0.304183923634301, 0.370018809916288, 0.267098790772231), c = c(33.1943176411012, 
30.1639208202477, 33.0233590742733, 28.6119107117576, 36.2990711051031, 
37.9411996955176, 30.8983355706005, 28.8675961210504, 33.7091588823272, 
31.5948361883575), d = c(78.4097065630287, 63.764559983601, 68.1384361747047, 
64.168012952684, 59.5403607467056, 65.1327537970861, 53.1702482266538, 
72.7933291693773, 64.9195200292714, 77.0356700221729), e = c(7L, 
8L, 9L, 9L, 7L, 8L, 9L, 7L, 10L, 7L), f = c(11L, 12L, 13L, 12L, 
12L, 14L, 12L, 15L, 13L, 14L)), .Names = c("a", "b", "c", "d", 
"e", "f"), row.names = c(NA, -10L), class = "data.frame")`

`farm3<-structure(list(a = c(-0.54252003099165, 1.20786780598317, 1.16040261569495, 
0.700213649514998, 1.58683345454085, 0.558486425565304, -1.27659220845804, 
-0.573265414236886, -1.22461261489836, -0.473400636439312), b = c(0.0601604404345152, 
-0.588894486259664, 0.531496192632572, -1.51839408178679, 0.306557860789766, 
-1.53644982353759, -0.300976126836611, -0.528279904445006, -0.652094780680999, 
-0.0568967778473925), c = c(30.1388999683276, 32.1263476194327, 
29.2672350543427, 32.4740863172122, 30.0362460682435, 37.3018618081179, 
34.1501224280516, 34.7305226884857, 33.152556073479, 37.0465282415583
), d = c(60.1855812763061, 61.2301316178366, 72.59369343125, 
60.0958218801378, 62.7557155383882, 61.6431524233481, 62.080042788709, 
62.3253201821406, 66.965129987607, 62.9360171063824), e = c(9L, 
8L, 6L, 9L, 8L, 9L, 8L, 9L, 8L, 6L), f = c(10L, 9L, 12L, 11L, 
12L, 15L, 14L, 12L, 13L, 9L)), .Names = c("a", "b", "c", "d", 
"e", "f"), row.names = c(NA, -10L), class = "data.frame")`

`farm4<-structure(list(a = c(-1.91435942568001, 1.17658331201856, -1.664972436212, 
-0.463530401472386, -1.11592010504285, -0.750819001193448, 2.08716654562835, 
0.0173956196932517, -1.28630053043433, -1.64060553441858), b = c(-1.23132342155804, 
0.983895570053379, 0.219924803660651, -1.46725002909224, 0.521022742648139, 
-0.158754604716016, 1.4645873119698, -0.766081999604665, -0.430211753928547, 
-0.926109497377437), c = c(33.350561303818, 31.9443205018561, 
31.0457948763685, 29.2119135576389, 27.5376190695755, 28.774423110153, 
35.0000864111417, 30.1361999156095, 27.8467194578465, 37.6078718672707
), d = c(66.5506022642347, 62.5681173945218, 70.3508982922541, 
69.3185359082496, 60.2845417106131, 77.2366147872428, 62.4698378191539, 
55.4530320987231, 63.1336023882747, 65.2452300353941), e = c(5L, 
9L, 8L, 8L, 8L, 9L, 8L, 9L, 9L, 7L), f = c(12L, 15L, 10L, 12L, 
7L, 13L, 10L, 15L, 9L, 12L)), .Names = c("a", "b", "c", "d", 
"e", "f"), row.names = c(NA, -10L), class = "data.frame")`

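All of them are structurally similar. I am trying to run a correlation exercise and am facing the following problems:

1) Run the correlation within each data frame, among the variables 'a', 'b', 'c', 'd', 'e', 'f'.

2) Run the above exercise across the 4 data frames (comparing the variables a, b, c, d, e, f in each data frame) and present the results in a table for farm1, farm2, farm3 and farm4. Can I specify a function and apply it to all 4 data frames at once, instead of running the same command 4 times?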

Each data frame relates to a unique farm, so the data frames cannot be merged.

I referred to the following posts and picked up a few things, but could not fully solve my problem:

https://stackoverflow.com/search?q=data+frame+correlation
Calculate correlation for more than two variables?
Calculate Correlations of Pairs of Columns in a Data Frame in R
Calculate correlation by aggregating columns of data frame
Pairwise Correlation Table

1 Answer:

Answer 0: (score: 2)

Without a reproducible example, it is hard to give a good answer.

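  1. Use mget to group the data frames into a single list. A list works well with the xxapply family of functions. I am using lapply here.
  2. cor can be applied to a matrix. You subset the data by column and create a matrix. This expects numeric observations; otherwise you should coerce them with as.numeric.
  3. Do the file management.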

    Applying this, we can use the following code:

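    cols <- letters[1:6] ## a, b, ..., f
    ## mget collects every object whose name starts with 'farm' into a named list;
    ## lapply then returns one correlation matrix per data frame
    lapply(mget(ls(pattern = '^farm')),
           function(x) cor(as.matrix(x[, cols])))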
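The question also asks for the results side by side in one table. One possible way to get there, as a rough sketch rather than the answerer's own method, is to flatten the upper triangle of each correlation matrix into a named vector and let sapply bind those vectors column-wise. The data frames below are made-up stand-ins generated with rnorm and sample so the block runs on its own; `make_farm`, `pair_cors` and `cor_table` are hypothetical names introduced only for this illustration.

```r
# Stand-in data (made up for illustration): four data frames with the
# same columns a..f. Replace `farms` with list(farm1 = farm1, ...) or
# with mget(ls(pattern = '^farm')) when the real data is loaded.
set.seed(1)
make_farm <- function() {
  data.frame(a = rnorm(10), b = rnorm(10),
             c = rnorm(10, mean = 32, sd = 3),
             d = rnorm(10, mean = 65, sd = 6),
             e = sample(5:10, 10, replace = TRUE),
             f = sample(7:15, 10, replace = TRUE))
}
farms <- list(farm1 = make_farm(), farm2 = make_farm(),
              farm3 = make_farm(), farm4 = make_farm())

cols <- letters[1:6]  # a, b, ..., f

# One 6 x 6 correlation matrix per farm.
cor_list <- lapply(farms, function(x) cor(x[, cols]))

# Flatten the upper triangle of a correlation matrix into a named
# vector with one entry per variable pair ("a-b", "a-c", ...).
pair_cors <- function(m) {
  idx <- which(upper.tri(m), arr.ind = TRUE)
  setNames(m[idx],
           paste(rownames(m)[idx[, "row"]],
                 colnames(m)[idx[, "col"]], sep = "-"))
}

# sapply binds the vectors as columns:
# rows are variable pairs, columns are farms.
cor_table <- sapply(cor_list, pair_cors)
print(round(cor_table, 2))
```

Keeping the farms in a named list means the farm names carry through to the column headers of the table, and the data frames themselves are never merged.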