How do I compute a mean based on a bounding box around the points in another data frame?

Time: 2015-07-23 18:44:08

Tags: r data.table

I have two data frames, each with thousands of data points (sample rows are shown further down).

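I would like to:

  • Read each lat-long combination in Dataframe 1
  • Create a square bounding box around it
  • Search Dataframe 2 for all lat-long combinations that fall inside that bounding box
  • Find the mean of v2 within that bounding box and add it to the corresponding row of Dataframe 1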
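The steps above can be sketched for a single point in base R. This is only a sketch under assumptions: box_mean and half_width_deg are hypothetical names, and the box is measured in degrees (0.05 is an arbitrary choice), not in metres.

```r
# Hypothetical helper: mean of v2 over the df2 points that fall inside a
# square box centred on (latval, lonval).
# half_width_deg is an assumed half-width in degrees, not a value from the question.
box_mean <- function(latval, lonval, df2, half_width_deg = 0.05) {
  inside <- df2$lat >= latval - half_width_deg &
            df2$lat <= latval + half_width_deg &
            df2$lon >= lonval - half_width_deg &
            df2$lon <= lonval + half_width_deg
  if (!any(inside)) return(NA_real_)  # no df2 point in the box
  mean(df2$v2[inside])
}

# Dataframe 2 sample rows from the question
df2 <- data.frame(lat = c(41.57, 41.58, 42.57, 41.55),
                  lon = c(-88.41, -88.12, -89.11, -88.61),
                  v2  = c(77, 56, 73, 14))

box_mean(41.57, -88.11, df2)  # 56: only the (41.58, -88.12) point is inside
```

A box in degrees is not square on the ground, because a degree of longitude shrinks with latitude. Whether that matters depends on how large the box is and where the points are.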

I have written a function that takes (latval, lonval). Sample rows from Dataframe 1 and Dataframe 2:

lat    lon   v1
41.57 -88.11 11
41.58 -88.12 12
42.57 -89.11 55
41.55 -88.31 12

lat    lon   v2
41.57 -88.41 77
41.58 -88.12 56
42.57 -89.11 73
41.55 -88.61 14

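How do I plug in the logic so that the function is called with each lat-long combination from DF1 and the result is written back to DF1? I don't know how to pass a row as function arguments.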

2 answers:

Answer 0: (score: 0)

I think all you need to do is

df1$meanval <- mapply(FUN = spatialmean, latval = df1$lat, 
                      lonval = df1$lon, distance = 5000)

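However, I don't think this will be any faster than a for loop. If speed is key, I would add the data.table tag to your question, since there is almost certainly a faster way to do this in data.table, but I am not proficient enough with it to show you a solution.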
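One possible data.table route is a non-equi join, which recent versions of the package support. The following is a rough sketch, not a benchmarked solution: it assumes the box is defined in degrees, and half_width_deg, the lat_lo/lat_hi/lon_lo/lon_hi columns, and v2mean are hypothetical names introduced only for illustration.

```r
library(data.table)

# Sample rows from the question
dt1 <- data.table(lat = c(41.57, 41.58, 42.57, 41.55),
                  lon = c(-88.11, -88.12, -89.11, -88.31),
                  v1  = c(11, 12, 55, 12))
dt2 <- data.table(lat = c(41.57, 41.58, 42.57, 41.55),
                  lon = c(-88.41, -88.12, -89.11, -88.61),
                  v2  = c(77, 56, 73, 14))

half_width_deg <- 0.05  # assumed box half-width in degrees

# Box edges for every dt1 point
dt1[, `:=`(lat_lo = lat - half_width_deg, lat_hi = lat + half_width_deg,
           lon_lo = lon - half_width_deg, lon_hi = lon + half_width_deg)]

# For each dt1 row, join the dt2 rows inside its box and average v2.
# Rows of dt1 with no match get NA.
res <- dt2[dt1,
           on = .(lat >= lat_lo, lat <= lat_hi, lon >= lon_lo, lon <= lon_hi),
           .(v2mean = mean(v2)),
           by = .EACHI]

dt1[, v2mean := res$v2mean]
```

The join produces one result row per row of dt1, in the same order, so the means can be assigned straight back. Whether this beats mapply on thousands of points would need to be measured on the real data.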

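Answer 1: (score: 0)

If you can use actual distance rather than a top-heavy square, I would use another function from the geosphere package to find the distances. Your squares will have overlapping areas and will include points farther than 5000 away (those near the corners) in your mean calculation.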
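To make the corner effect concrete, here is the arithmetic as a small sketch. It assumes the square is built with a half-width of 5000 metres and treats the ground as flat, which is a reasonable approximation at this scale.

```r
half_side   <- 5000                  # assumed half-width of the square, in metres
corner_dist <- sqrt(2) * half_side   # centre-to-corner distance, flat approximation

corner_dist              # about 7071 metres
corner_dist > half_side  # TRUE: corner points lie beyond the 5000 m radius
```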

library(geosphere)  # provides distm()

# sample data; df1 has an extra row so that the rows (df1) and columns (df2)
# of the distance matrix are easy to tell apart
df1 <- data.frame(lat = c(41.57,41.58,42.57,41.55,41.55),
    lon = c(-88.11,-88.12,-89.11,-88.31,-88.31),
    v1 = c(11,12,55,12,12))

df2 <- data.frame(lat = c(41.57,41.58,42.57,41.55),
    lon = c(-88.41,-88.12,-88.11,-88.61),
    v2 = c(77,56,73,14))

# set max distance
maxdist <- 5000

# calculate all pairwise distances (in metres; rows = df1 points, columns = df2 points)
# and check if within limit
distances <- distm(x = df1[ ,c('lon','lat')],y = df2[ ,c('lon','lat')])
withindistance <- distances < maxdist

# grab all v2 based on df1 and calculate the mean. returns NaN if no points within distance
df1$df2mean <- apply(withindistance,1,function(x,funv2){
  mean(funv2[x])
},funv2 = df2$v2)

# or the apply like most would write it. either apply works
df1$df2mean <- apply(withindistance,1,function(x){
  mean(df2$v2[x])
})
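If the row-wise apply becomes slow on a large matrix, the same means can be computed with matrix arithmetic. This is a sketch of that alternative, not part of the answer above. It uses a small hand-made matrix in place of the real withindistance, and the names n_within, sum_v2 and v2mean are made up for illustration.

```r
# Hypothetical 3 x 2 within-distance matrix:
# rows stand for df1 points, columns for df2 points
withindistance <- matrix(c(TRUE,  FALSE,
                           TRUE,  TRUE,
                           FALSE, FALSE),
                         nrow = 3, byrow = TRUE)
v2 <- c(10, 30)  # stand-in for df2$v2

n_within <- rowSums(withindistance)                  # matches per row: 1, 2, 0
sum_v2   <- as.vector((withindistance * 1) %*% v2)   # sum of matched v2: 10, 40, 0

# NA (rather than NaN) where no df2 point is within the distance
v2mean <- ifelse(n_within > 0, sum_v2 / n_within, NA_real_)
v2mean  # 10 20 NA
```

Keeping n_within alongside the mean also shows how many points each average is based on.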