R: correlation and correlation coefficients between two datasets

Asked: 2014-03-17 13:08:06

Tags: r correlation

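I have two datasets: one shows school enrollment for 6 countries and the other shows each country's GDP. I want to compute the correlation coefficient between enrollment and GDP for each country. I looked at this question: How can I create a correlation matrix in R?

But I have a problem with the dimensions of the two datasets (their numbers of rows and columns do not match).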

School enrollment dataset: https://drive.google.com/file/d/0B1NJGKqdrgRtTjcySzZOM2xKZU0/edit?usp=sharing

    CountryName year_2000   year_2004   year_2008   year_2012
    Comoros 201899884   362420484   4880000000  6800000000
    Jordan  8457923945  11407566660 54082389393 58768800833
    UAEmirates  104337375343    147824374543    21902892584 36044457920
    Egypt   99838540997 78845185709 840000000   1240000000
    Qatar   17759889598 31675273812 131611819294    210279947256
    Syria   19325894913 25086930693 88882967742 95981572517

GDP dataset: https://drive.google.com/file/d/0B1NJGKqdrgRtRm9SWm9ObGpwbU0/edit?usp=sharing

    Indicator   com_2000    com_2004    com_2008    com_2012    Jor_2000    Jor_2004    Jor_2008    Jor_2012    ARE_2000    ARE_2004    ARE_2008    ARE_2012    Egy_2000    Egy_2004    Egy_2008    Egy_2012    Qat_2000    Qat_2004    Qat_2008    Qat_2012    Syr_2000    Syr_2004    Syr_2008    Syr_2012
    preprimary (% gross)    2.39124 4.3563  23.68581    24.80515401 31.08014    32.71263    37.38376    33.81492    63.34796    81.92245    91.926025   71.14425    11.94312    15.1121 23.49822    27.3631 29.23454    32.69621    49.64917    73.42391    8.67231 10.00469    9.93459 10.6214
    primary (% gross)   116.7763    121.0558    112.08  117.3767    102.3871    106.8326    102.04  98.87783    94.22761    102.304 107.5285    108.3284    101.3365    105.5968    109.9804    108.6207    104.7228    106.0118    104.0118    102.94  107.6219    121.8342    118.0423    122.2586
    secondary (% gross) 31.8468 48.04706    60.04706    73.48619    85.90683    91.6662 93.89221    89.05884    45.0041 57.57103    68.905185   72.91143    85.83446    87.64275    89.48275    76.06258    86.4097 110.453 93.25074    12.14547    43.96275    66.56304    72.69195    74.42249
    tertiary (% gross)  1.41838 3.00913 6.474124923 11.42145    28.28053    39.41155    44.30046    39.93893    0   0   0   0   31.62423    30.32905    31.64919    28.7532 22.565405   17.80551    11.3693 12.14547    12.00074    15.0151 24.20384    25.63541
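One possible way to get past the shape mismatch is sketched below. It assumes that the country-by-year table holds the GDP figures and the indicator table holds the enrollment percentages (which is what the test subsets further down suggest), and it retypes only the Comoros and Jordan values by hand. The names `gdp_wide`, `enroll_wide`, `prefix`, and `cor_by_country` are made up for illustration. The idea is to pull out, for one country at a time, a GDP vector with one value per year and an enrollment matrix with one row per year, so that `cor()` sees the same number of observations on both sides.

```r
# Country-by-year table (assumed to be GDP), Comoros and Jordan only.
gdp_wide <- data.frame(
  CountryName = c("Comoros", "Jordan"),
  year_2000 = c(201899884, 8457923945),
  year_2004 = c(362420484, 11407566660),
  year_2008 = c(4880000000, 54082389393),
  year_2012 = c(6800000000, 58768800833)
)

# Indicator table (assumed to be enrollment, % gross), same two countries.
enroll_wide <- data.frame(
  Indicator = c("preprimary", "primary", "secondary", "tertiary"),
  com_2000 = c(2.39124, 116.7763, 31.8468, 1.41838),
  com_2004 = c(4.3563, 121.0558, 48.04706, 3.00913),
  com_2008 = c(23.68581, 112.08, 60.04706, 6.474124923),
  com_2012 = c(24.80515401, 117.3767, 73.48619, 11.42145),
  Jor_2000 = c(31.08014, 102.3871, 85.90683, 28.28053),
  Jor_2004 = c(32.71263, 106.8326, 91.6662, 39.41155),
  Jor_2008 = c(37.38376, 102.04, 93.89221, 44.30046),
  Jor_2012 = c(33.81492, 98.87783, 89.05884, 39.93893)
)

# Hypothetical lookup from country name to the column prefix used above.
prefix <- c(Comoros = "com", Jordan = "Jor")
years <- c(2000, 2004, 2008, 2012)

cor_by_country <- sapply(names(prefix), function(country) {
  # GDP for this country: numeric vector, one value per year.
  x <- unlist(gdp_wide[gdp_wide$CountryName == country, paste0("year_", years)])
  # Enrollment for this country: rows = years, columns = indicators.
  y <- t(as.matrix(enroll_wide[, paste0(prefix[[country]], "_", years)]))
  colnames(y) <- as.character(enroll_wide$Indicator)
  # cor() pairs x with each column of y and returns a 1 x 4 matrix.
  cor(x, y)[1, ]
})

# Rows are enrollment levels, columns are countries.
print(round(cor_by_country, 2))
```

With only four years per country, each coefficient rests on four observations, so the values are best read as descriptive rather than as evidence of a relationship.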

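The x-axis should show the years (2000, 2004, 2008, 2012) and the y-axis the enrollment by type. I want a separate chart for each country (there is a link to an example chart in the comments).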
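A chart of that kind could be sketched with `lattice` roughly as follows. This is one reading of the requirement (one panel per country, years on the x-axis, one line per enrollment level), not necessarily the layout of the linked chart. The data frame `long` and its column names are invented for the example, and only the Comoros and Jordan values from the indicator table are retyped.

```r
library(lattice)

years <- c(2000, 2004, 2008, 2012)

# Long form: one row per country, enrollment level, and year.
# expand.grid() varies its first argument fastest, so the values below are
# listed year by year within each level, and level by level within each country.
long <- expand.grid(
  year = years,
  level = c("preprimary", "primary", "secondary", "tertiary"),
  country = c("Comoros", "Jordan")
)
long$value <- c(
  2.39124, 4.3563, 23.68581, 24.80515401,   # Comoros preprimary
  116.7763, 121.0558, 112.08, 117.3767,     # Comoros primary
  31.8468, 48.04706, 60.04706, 73.48619,    # Comoros secondary
  1.41838, 3.00913, 6.474124923, 11.42145,  # Comoros tertiary
  31.08014, 32.71263, 37.38376, 33.81492,   # Jordan preprimary
  102.3871, 106.8326, 102.04, 98.87783,     # Jordan primary
  85.90683, 91.6662, 93.89221, 89.05884,    # Jordan secondary
  28.28053, 39.41155, 44.30046, 39.93893    # Jordan tertiary
)

# One panel per country; one line per enrollment level.
p <- xyplot(
  value ~ year | country,
  groups = level,
  data = long,
  type = "b",
  auto.key = list(columns = 2),
  xlab = "Year",
  ylab = "Gross enrollment (%)",
  scales = list(x = list(at = years))
)
```

Calling `print(p)` draws the chart; at the interactive prompt, typing `p` has the same effect.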

My own code is not correct, but this is what I have so far:

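    library(lattice)
    # Read the two CSV files, each chosen interactively
    xtest <- read.csv(file.choose(), header = TRUE, sep = ",")
    ytest <- read.csv(file.choose(), header = FALSE, sep = ",")
    xvalues <- as.matrix(xtest)
    yvalues <- as.matrix(ytest)
    # Correlate every column of xvalues with every column of yvalues
    corvalue <- cor(xvalues, yvalues)
    # Draw the correlation matrix as a heat map and print each coefficient in its cell
    image(x = seq(dim(xvalues)[2]), y = seq(dim(yvalues)[2]), z = corvalue, xlab = "x column", ylab = "y column")
    text(expand.grid(x = seq(dim(xvalues)[2]), y = seq(dim(yvalues)[2])), labels = round(c(corvalue), 2))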

As a test, I took a subset of the original GDP dataset, xtest:

    Comoros Comoros Comoros Comoros
    201899884   201899884   201899884   201899884
    362420484   362420484   362420484   362420484
    4880000000  4880000000  4880000000  4880000000
    6800000000  6800000000  6800000000  6800000000

For school enrollment, I took a subset of the data, ytest:

    0   2.39124 4.3563  23.68581    24.80515401
    99.78652    116.7763    121.0558    112.08  117.3767
    0   31.8468 48.04706    60.04706    73.48619
    0.82459 1.41838 3.00913 6.474124923 11.42145

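A note on orientation, with a small sketch: `cor(x, y)` treats rows as observations and correlates columns, so both inputs need one row per year. In `xtest` the years run down the rows, while in `ytest` they run across the columns, and `ytest` has five columns against four GDP values. The sketch below assumes (this is a guess, not something the data states) that the first `ytest` column is an extra year with no GDP counterpart and drops it. The names `gdp` and `enroll` are illustrative.

```r
# One column of xtest: Comoros GDP for the four years.
gdp <- c(201899884, 362420484, 4880000000, 6800000000)

# ytest as shown above: rows = enrollment levels, columns = years.
ytest <- matrix(c(
  0,        2.39124,  4.3563,   23.68581,    24.80515401,
  99.78652, 116.7763, 121.0558, 112.08,      117.3767,
  0,        31.8468,  48.04706, 60.04706,    73.48619,
  0.82459,  1.41838,  3.00913,  6.474124923, 11.42145
), nrow = 4, byrow = TRUE)

# Assumption: column 1 is an extra year without a matching GDP value.
# Drop it, then transpose so that rows = years and columns = levels.
enroll <- t(ytest[, -1])

# 1 x 4 matrix: correlation of GDP with each enrollment level.
corvalue <- cor(gdp, enroll)
print(round(corvalue, 2))
```

Because the four `xtest` columns are identical, a single GDP vector carries the same information as the whole matrix.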
Any suggestions for improving the output? The resulting output is linked in the comments.

0 answers:

No answers