用R中的嵌套循环计算

时间:2015-04-24 19:23:03

标签: r nested-loops

我有一个包含三列的地理编码数据集:纬度,经度和群集。我计算了集群的平均中心,并将结果存储在两个列表Center_lat和Center_lon中。

现在我想用Haversine公式计算每个观察点(3000+)到每个聚类中心(30)的距离。获得3000乘30的矩阵。

我尝试使用嵌套for循环,但所有集群都有相同的距离。这是代码。

for (i in 1:n){
    for (k in 1:c){
    lat1=radians(Geocode[i,1])
    lon1=radians(Geocode[i,2])
    lat2=radians(Center_lat[k,2])
    lon2=radians(Center_lon[k,2])
    }
    R <- 3958.756 # Earth mean radius [miles]
    dist_mat[i,] <- acos(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2) * cos(lon2-lon1)) * R
  }

我也在考虑使用lapply来替换嵌套循环。但我不确定如何使用该功能...感谢任何帮助。

# Convert to radian
radians = function(theta=0){return(theta * pi / 180)}

# Calculates the geodesic distance from each property to the center of it's current cluster using the
# Spherical Law of Cosines (slc)
get_dist <- function(lat1, lon1, lat2, lon2) {
  R <- 3958.756 # Earth mean radius [miles]
  d <- acos(sin(radians(lat1))*sin(radians(lat2)) + 
              cos(radians(lat1))*cos(radians(lat2)) * cos(radians(lon2)-radians(lon1))) * R
  return(d) # Distance in miles
}

dist_mat<-lapply()

2 个答案:

答案 0 :(得分:1)

This is the type of computation you want to vectorize in R. Here we use outer to generate all the possible combinations of row indices from your Geocode data and Center_x data, and then apply the distance function in one fell swoop.

First, get data in easier to use form (one matrix for locations, another for centers, first column lats, second lons):

# See Data section below for actual data used

# G <- radians(Geocode)
# C <- radians(cbind(Center_lat[, 2], Center_lon[, 2]))    
R <- 3958.756 # Earth mean radius [miles]

Define the function, notice how we use the indices to look up the actual coordinates in G and C, and how the function is vectorized (i.e. we only need to call it once with all the data):

my_dist <- function(xind, yind)
  acos(
    sin(G[xind, 1]) * sin(C[yind, 1]) + 
    cos(G[xind, 1]) * cos(C[yind, 1]) * cos(C[yind, 2] - G[xind, 2])
  ) * R

And apply it with outer:

DISTS <- outer(seq.int(nrow(G)), seq.int(nrow(C)), my_dist)

str(DISTS)
# num [1:3000, 1:30] 4208 6500 8623 7303 3864 ...

quantile(DISTS) # to make sure stuff is reasonable:
#     0%       25%       50%       75%      100% 
#  0.000  4107.574  6204.799  8333.155 12422.059     

This runs in about 30ms on my system.


Data:

set.seed(1)
lats <- runif(10000, -60, 60) * pi / 180
lons <- runif(10000, -179, 180) * pi / 180

G.ind <- sample(10000, 3000)
C.ind <- sample(10000, 30)

G <- cbind(lats[G.ind], lons[G.ind])
C <- cbind(lats[C.ind], lons[C.ind])

答案 1 :(得分:0)

看起来你想要每行和每列写一次矩阵,所以你想要在两个for循环中改变矩阵,如下所示:

{{1}}