我有三个数据框如下。我想根据Lon& amp;将它们组合成一个数据帧。 Lat,并平均每个'单元'的3个值。我已经读过这个(calculate average over multiple data frames)并试图利用聚合但无济于事....任何指针都赞赏。
> head(CSR.GRACE[,c(1:14)],10)
Lon Lat January February March April May June July August September October November December
1 28.5 -4.5 17.710425 13.855327 12.385712 13.558101 12.789865 6.913783 1.03770075 -5.3901741 -6.6351015 -7.661375 -3.09337944 6.0659410
2 29.5 -4.5 14.010154 10.257435 9.009641 10.275778 9.598241 5.166972 0.73570247 -4.2733162 -5.0861417 -5.850192 -2.93521806 4.1240150
3 30.5 -4.5 16.288443 10.467614 9.275714 10.904162 10.228808 5.364853 0.50089883 -4.7478741 -5.4320069 -6.316568 -3.80160315 3.8494745
4 31.5 -4.5 18.560677 9.932461 9.239592 11.037748 10.551886 5.281853 0.01181973 -4.9034324 -5.3504391 -6.438050 -4.41695714 3.3432301
5 32.5 -4.5 10.171202 4.476512 4.509140 5.448872 5.338991 2.556262 -0.22646611 -2.3274204 -2.4376636 -3.103697 -2.27586145 1.3641930
6 33.5 -4.5 14.040068 5.349344 5.772618 7.158792 7.121341 3.407587 -0.30616689 -2.6800099 -2.7955420 -3.803622 -2.77898997 1.4021380
> head(GFZ.GRACE[,c(1:14)],10)
Lon Lat January February March April May June July August September October November December
1 28.5 -4.5 15.642782 15.521720 11.823875 19.825865 17.335761 11.208188 5.080615 -3.0897644 -5.733351 -4.196604 -1.6697661 10.744696
2 29.5 -4.5 12.164074 10.931418 8.622238 15.341911 12.969769 8.521280 4.072790 -2.4301791 -4.551170 -3.055914 -1.2260079 7.592880
3 30.5 -4.5 13.579305 10.267520 8.787406 16.567715 13.745143 9.121496 4.497849 -2.6723491 -5.022949 -3.269881 -1.0691039 7.377143
4 31.5 -4.5 14.501465 8.600480 8.259757 16.981533 14.054429 9.318550 4.582672 -2.7917893 -5.249895 -3.636936 -0.5141342 6.770836
5 32.5 -4.5 7.311216 3.249596 3.513870 8.430777 6.941659 4.572560 2.203461 -1.4106516 -2.661226 -2.113089 0.2459282 3.049897
6 33.5 -4.5 9.121348 3.113245 3.584976 11.040761 8.732950 5.772059 2.811168 -1.8554437 -3.524447 -3.272863 1.2493973 3.750694
> head(JPL.GRACE[,c(1:14)],10)
Lon Lat January February March April May June July August September October November December
1 28.5 -4.5 19.559790 14.544438 12.035112 13.944141 11.931011 7.513007 3.095003 -3.6165702 -6.5945043 -7.2498567 -4.5402436 6.3935236
2 29.5 -4.5 15.740160 11.192191 8.549782 10.783359 9.401173 5.834498 2.267822 -2.6354346 -4.8939197 -5.5912996 -3.7295148 4.1461123
3 30.5 -4.5 18.984714 12.014807 8.510139 11.628697 10.635699 6.448064 2.260429 -2.6979695 -5.2102337 -6.2646164 -4.2713238 3.5089825
4 31.5 -4.5 22.794356 11.993054 8.162500 11.813746 11.747350 6.955983 2.164615 -2.5707902 -5.3448873 -6.7473006 -4.5777496 2.5609555
5 32.5 -4.5 13.233634 5.606305 3.880347 5.753024 6.388978 3.742596 1.096214 -1.1103189 -2.6367831 -3.4102675 -2.2860237 0.7826054
6 33.5 -4.5 19.260989 6.761722 4.978247 7.373498 9.135645 5.421030 1.706414 -1.0796434 -3.3122886 -4.2114588 -2.8110246 0.4825075
答案 0 :(得分:9)
你可以这样做:
library(data.table)
rbindlist(list(JPL.GRACE,GFZ.GRACE,CSR.GRACE))[,lapply(.SD,mean), list(Lon, Lat)]
<强>说明强>:
data.frames
list
被置于rbindlist
和&#39;叠加的地平线&#39;使用data.table
(返回data.frame
)。我们这样做是因为您的do.call(rbind, list(JPL.GRACE,GFZ.GRACE,CSR.GRACE))
具有相同的结构(相同的列数和名称,相同类型的数据)。
另一种方法是Lon, Lat
。
然后我们遍历每个不同的.SD
对。 data.table
表示与每对关联的dt = rbindlist(list(JPL.GRACE,GFZ.GRACE,CSR.GRACE))
dt[,print(.SD), list(Lon, Lat)]
。你可以这样做:
.SD
对于这些s_dict
中的每一个,我们只需循环遍历列并计算均值。
答案 1 :(得分:2)
由于您的数据全部为数字,因此您可以将其放入3d数组并使用rowMeans
library(abind)
arr = abind(JPL.GRACE, GFZ.GRACE, CSR.GRACE, along = 3)
rowMeans(arr, dims = 2)
答案 2 :(得分:2)
使用1:2
作为&#34; MARGIN&#34;
library(abind)
temp_array <- abind(CSR.GRACE, GFZ.GRACE, JPL.GRACE, along=3)
res <- apply(temp_array, 1:2, mean)
这是一个简单的例子:
install.packages('abind')
x <- matrix(1:12,3,4)
y <- x+100; z= y-50
apply( abind::abind(x,y,z, along=3), 1:2, mean)
[,1] [,2] [,3] [,4]
[1,] 51 54 57 60
[2,] 52 55 58 61
[3,] 53 56 59 62
答案 3 :(得分:1)
也许sqldf
package可以帮到你:
library(sqldf)
df1 <- CSR.GRACE[,c(1:14)]
df2 <- GFZ.GRACE[,c(1:14)]
df3 <- JPL.GRACE[,c(1:14)]
# This could be done with rbind(), but I'll use sqldf()
# I'm assuming all data frames have the same columns:
df <- sqldf('select * from df1
union all select * from df2
union all select * from df3')
# The average can be done also with sqldf (just a demo)
sqldf('select Lon, Lat, avg(January) as jan, avg(February) as feb
from df
group by Lon, Lat')
可能有更好的解决方案,但这是一种简单的方法。
希望这有帮助
答案 4 :(得分:0)
这是一个基本的 R
解决方案:
r <- do.call(rbind, list(JPL.GRACE,GFZ.GRACE,CSR.GRACE))
r <- aggregate(r[-c(1,2)], list(r$Lon, r$Lat), mean)