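I have two datasets: one shows school enrollment for 6 countries, and the other shows the GDP of each country. I want to compute the correlation coefficient between enrollment and GDP for each country. I have been looking at the question: How can I create a correlation matrix in R?
But I have a problem with the dimensions of the two datasets (their numbers of rows and columns do not match)...
School enrollment dataset: https://drive.google.com/file/d/0B1NJGKqdrgRtTjcySzZOM2xKZU0/edit?usp=sharing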
CountryName  year_2000     year_2004     year_2008     year_2012
Comoros      201899884     362420484     4880000000    6800000000
Jordan       8457923945    11407566660   54082389393   58768800833
UAEmirates   104337375343  147824374543  21902892584   36044457920
Egypt        99838540997   78845185709   840000000     1240000000
Qatar        17759889598   31675273812   131611819294  210279947256
Syria        19325894913   25086930693   88882967742   95981572517
GDP dataset: https://drive.google.com/file/d/0B1NJGKqdrgRtRm9SWm9ObGpwbU0/edit?usp=sharing
Indicator com_2000 com_2004 com_2008 com_2012 Jor_2000 Jor_2004 Jor_2008 Jor_2012 ARE_2000 ARE_2004 ARE_2008 ARE_2012 Egy_2000 Egy_2004 Egy_2008 Egy_2012 Qat_2000 Qat_2004 Qat_2008 Qat_2012 Syr_2000 Syr_2004 Syr_2008 Syr_2012
preprimary (% gross) 2.39124 4.3563 23.68581 24.80515401 31.08014 32.71263 37.38376 33.81492 63.34796 81.92245 91.926025 71.14425 11.94312 15.1121 23.49822 27.3631 29.23454 32.69621 49.64917 73.42391 8.67231 10.00469 9.93459 10.6214
primary (% gross) 116.7763 121.0558 112.08 117.3767 102.3871 106.8326 102.04 98.87783 94.22761 102.304 107.5285 108.3284 101.3365 105.5968 109.9804 108.6207 104.7228 106.0118 104.0118 102.94 107.6219 121.8342 118.0423 122.2586
secondary (% gross) 31.8468 48.04706 60.04706 73.48619 85.90683 91.6662 93.89221 89.05884 45.0041 57.57103 68.905185 72.91143 85.83446 87.64275 89.48275 76.06258 86.4097 110.453 93.25074 12.14547 43.96275 66.56304 72.69195 74.42249
tertiary (% gross) 1.41838 3.00913 6.474124923 11.42145 28.28053 39.41155 44.30046 39.93893 0 0 0 0 31.62423 30.32905 31.64919 28.7532 22.565405 17.80551 11.3693 12.14547 12.00074 15.0151 24.20384 25.63541
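To make the dimension problem concrete, here is a minimal sketch of how I think the two tables could be lined up, using only Comoros and Jordan and only the preprimary and primary rows. The object names (`gdp`, `enroll`, `prefix`, `cor_by_country`) are made up for illustration, and the mapping from country name to column prefix (Comoros to `com`, Jordan to `Jor`) is my assumption from the column headers. With only four years per country, each coefficient rests on four points, so I would not read too much into the values.

```r
years <- c(2000, 2004, 2008, 2012)

# Table with one row per country and one column per year (values from the post)
gdp <- read.table(header = TRUE, text = "
CountryName year_2000 year_2004 year_2008 year_2012
Comoros 201899884 362420484 4880000000 6800000000
Jordan 8457923945 11407566660 54082389393 58768800833
")

# Table with one row per enrollment type and one column per country/year pair
enroll <- matrix(
  c(  2.39124,   4.3563,  23.68581,  24.80515401,  31.08014,  32.71263,  37.38376, 33.81492,
    116.7763,  121.0558, 112.08,    117.3767,     102.3871,  106.8326,  102.04,   98.87783),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("preprimary", "primary"),
                  c(paste0("com_", years), paste0("Jor_", years)))
)

# Assumed mapping from country name to the prefix used in the column names
prefix <- c(Comoros = "com", Jordan = "Jor")

cor_by_country <- sapply(names(prefix), function(country) {
  # Four values of the first table for this country, one per year
  g <- unlist(gdp[gdp$CountryName == country, paste0("year_", years)])
  # Transpose so that rows are years and columns are enrollment types
  e <- t(enroll[, paste0(prefix[[country]], "_", years)])
  # Correlate the four yearly values with each enrollment type
  cor(g, e)[1, ]
})

# Rows are enrollment types, columns are countries
print(round(cor_by_country, 2))
```

The key step is the transpose: `cor()` pairs observations row by row, so both inputs need one row per year.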
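The x-axis must show the year values (2000, 2004, 2008, 2012) and the y-axis the enrollment types. I want a separate chart for each country (a link to an example chart is in the comments).
My own code below is not correct yet, but this is my start: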
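As a rough sketch of the layout I have in mind, the following uses `levelplot()` from lattice with one panel per country, years on the x-axis and enrollment types on the y-axis. I am assuming here that the cell colour shows the enrollment rate itself; the data frame `long` and its column names are made up for illustration, and only two countries and two enrollment types are included to keep it short.

```r
library(lattice)  # recommended package, bundled with most R installations

years <- c(2000, 2004, 2008, 2012)

# Long format: one row per country, enrollment type and year
long <- data.frame(
  country = rep(c("Comoros", "Jordan"), each = 8),
  type    = rep(rep(c("preprimary", "primary"), each = 4), times = 2),
  year    = factor(rep(years, times = 4)),
  value   = c(  2.39124,   4.3563,  23.68581,  24.80515401,  # Comoros preprimary
              116.7763,  121.0558, 112.08,    117.3767,      # Comoros primary
               31.08014,  32.71263, 37.38376,  33.81492,     # Jordan preprimary
              102.3871,  106.8326, 102.04,     98.87783)     # Jordan primary
)

# One panel per country; x = year, y = enrollment type, colour = value
p <- levelplot(value ~ year * type | country, data = long,
               xlab = "Year", ylab = "Enrollment type")
print(p)
```

Replacing `value` with another quantity would keep the same layout, as long as the data frame has one row per country, type and year.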
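library(lattice)
# Read the two CSV files chosen interactively; the second one has no header row
xtest <- read.csv(file.choose(), header = TRUE, sep = ",")
ytest <- read.csv(file.choose(), header = FALSE, sep = ",")
xvalues <- as.matrix(xtest)
yvalues <- as.matrix(ytest)
# Correlate every column of xvalues with every column of yvalues
# (cor() needs both matrices to have the same number of rows)
corvalue <- cor(xvalues, yvalues)
# Draw the correlation matrix as a grid and print each coefficient in its cell
image(x = seq(dim(xvalues)[2]), y = seq(dim(yvalues)[2]), z = corvalue, xlab = "x column", ylab = "y column")
text(expand.grid(x = seq(dim(xvalues)[2]), y = seq(dim(yvalues)[2])), labels = round(c(corvalue), 2))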
As a test, I took a subset of the original GDP dataset, xtest:
Comoros Comoros Comoros Comoros
201899884 201899884 201899884 201899884
362420484 362420484 362420484 362420484
4880000000 4880000000 4880000000 4880000000
6800000000 6800000000 6800000000 6800000000
For school enrollment, I took a subset of the data, ytest:
0 2.39124 4.3563 23.68581 24.80515401
99.78652 116.7763 121.0558 112.08 117.3767
0 31.8468 48.04706 60.04706 73.48619
0.82459 1.41838 3.00913 6.474124923 11.42145
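Looking at these two subsets, the rows of xtest are years while the rows of ytest are enrollment types, so I suspect ytest has to be transposed before calling `cor()`. Below is a minimal sketch of that idea. The row labels come from matching the values against the Comoros columns of the full table; treating the first column of ytest as an extra, earlier year and dropping it is my assumption. The names `gdp_com` and `enroll_com` are made up for illustration.

```r
# One column of xtest: the Comoros values for 2000, 2004, 2008, 2012
gdp_com <- c(201899884, 362420484, 4880000000, 6800000000)

# ytest as shown above: rows = enrollment types, columns = years
ytest <- matrix(
  c(0,        2.39124,  4.3563,   23.68581,    24.80515401,
    99.78652, 116.7763, 121.0558, 112.08,      117.3767,
    0,        31.8468,  48.04706, 60.04706,    73.48619,
    0.82459,  1.41838,  3.00913,  6.474124923, 11.42145),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("preprimary", "primary", "secondary", "tertiary"), NULL)
)

# Drop the first column (assumed to be an earlier year) and transpose,
# so that rows are the four years and match gdp_com
enroll_com <- t(ytest[, -1])

# 1 x 4 matrix: one coefficient per enrollment type
corvalue <- cor(gdp_com, enroll_com)
print(round(corvalue, 2))
```

Because the four columns of xtest are identical, correlating the whole matrix would only repeat the same row four times, which is why a single column is used here.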
Any suggestions for improving the output? The output I get is in the comments: