Question

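I'm not sure whether this is the right place to ask (I'm new to R and to this site). My question is: how can I calculate the distance between longitude/latitude points?

I searched this site for an answer, but the answers I found only consider 2 points (whereas my dataset contains more than 207,000 rows).

I have a data frame named 'adsb_relevant_columns_correct_timedifference' with the following columns: Callsign, Altitude, Speed, Direction, Date_Time, Latitude, Longitude.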

Callsign Altitude  Speed Direction   Date_Time             Latitude     Longitude
A118       18000    110  340         2017-11-06 22:28:09   70.6086      58.2959
A118       18500    120  339         2017-11-06 22:29:09   72.1508      58.7894
B222       18500    150  350         2017-11-08 07:28:09   71.1689      59.1234
D123       19000    150  110         2018-05-29 15:13:27   69.4523      68.1235

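I want to calculate the distance (in meters) between consecutive measurements (each row is a new measurement) and add it as a new column named "Distance". For later purposes, the first distance calculation should appear in the second row. The first row of the "Distance" column can therefore be zero or NA; it doesn't matter.

So I want to know the distance between the first measurement (Lat 70.6086 and Long 58.2959) and the second measurement (Lat 72.1508 and Long 58.7894), then the distance between the second and third measurements, then between the third and fourth, and so on for more than 207,000 measurements.

The expected output should look like this: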

Callsign Altitude  Speed Direction   Date_Time             Latitude     Longitude  Distance 
A118       18000    110  340         2017-11-06 22:28:09   70.6086      58.2959    NA  
A118       18500    120  339         2017-11-06 22:29:09   72.1508      58.7894    172000
B222       18500    150  350         2017-11-08 07:28:09   71.1689      59.1234    110000
D123       19000    150  110         2018-05-29 15:13:27   69.4523      68.1235    387000

I found the distm function in R, with which I can do this manually for two measurements, but not for the complete dataset.

distm(p1, p2, fun = distHaversine)

I tried the following code:

adsb_relevant_columns_correct_timedifference <- mutate(adsb_relevant_columns_correct_timedifference, Distance =
distm(c(adsb_relevant_columns_correct_timedifference$Longitude, adsb_relevant_columns_correct_timedifference$Latitude),
      c(lag(adsb_relevant_columns_correct_timedifference$Longitude, adsb_relevant_columns_correct_timedifference$Latitude)), fun = distCosine))

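However, I get the following error:

Error in mutate_impl(.data, dots) : Evaluation error: Wrong length for a vector, should be 2.

Sorry for the long explanation, but I hope my question is clear. Can someone tell me how to calculate the distance between several measurements and add it as a new column to my data frame?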

Answer 1

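Instead of distm you can use the distHaversine function. Furthermore, in your mutate call you should not repeat the data frame and use the $ operator; mutate already knows where to look for the columns. The error occurs because you need to use cbind instead of c: c creates one long vector by simply stacking the columns on top of each other, whereas cbind creates a two-column matrix with one longitude/latitude pair per row (which is what you want in this case).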

library(geosphere)
library(dplyr)

mutate(mydata, 
       Distance = distHaversine(cbind(Longitude, Latitude),
                                cbind(lag(Longitude), lag(Latitude))))

#   Callsign Altitude Speed Direction           Date_Time Latitude Longitude Distance
# 1     A118    18000   110       340 2017-11-06T22:28:09  70.6086   58.2959       NA
# 2     A118    18500   120       339 2017-11-06T22:29:09  72.1508   58.7894 172569.2
# 3     B222    18500   150       350 2017-11-08T07:28:09  71.1689   59.1234 109928.5
# 4     D123    19000   150       110 2018-05-29T15:13:27  69.4523   68.1235 387356.2
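The difference between c and cbind described above can be checked in base R alone. The following is a minimal sketch using the first three coordinates from the question; the variable names stacked and paired are made up for illustration.

```r
lon <- c(58.2959, 58.7894, 59.1234)
lat <- c(70.6086, 72.1508, 71.1689)

# c() stacks the inputs into one vector: all longitudes, then all latitudes
stacked <- c(lon, lat)

# cbind() keeps the pairs together: a 3 x 2 matrix, one (lon, lat) pair per row
paired <- cbind(lon, lat)

length(stacked)  # length of the stacked vector
dim(paired)      # rows and columns of the matrix
```

A function that expects one point per row can work with paired, while stacked has lost the information about which longitude belongs to which latitude.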

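Using distCosine is a bit trickier, because it doesn't return NA if one of the input latitudes or longitudes is missing. I therefore modified the function slightly, which solves the problem: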

modified_distCosine <- function(Longitude1, Latitude1, Longitude2, Latitude2) {
  if (any(is.na(c(Longitude1, Latitude1, Longitude2, Latitude2)))) {
    NA
  } else {
    distCosine(c(Longitude1, Latitude1), c(Longitude2, Latitude2))
  }
}

mutate(mydata, 
       Distance = mapply(modified_distCosine, 
                         Longitude, Latitude, lag(Longitude), lag(Latitude)))

#   Callsign Altitude Speed Direction           Date_Time Latitude Longitude Distance
# 1     A118    18000   110       340 2017-11-06T22:28:09  70.6086   58.2959       NA
# 2     A118    18500   120       339 2017-11-06T22:29:09  72.1508   58.7894 172569.2
# 3     B222    18500   150       350 2017-11-08T07:28:09  71.1689   59.1234 109928.5
# 4     D123    19000   150       110 2018-05-29T15:13:27  69.4523   68.1235 387356.2

Here I use mapply to apply the modified function with the arguments Longitude, Latitude, lag(Longitude), lag(Latitude). I'm fairly sure there must be a more elegant way, but at least this works.
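One possible alternative to the mapply wrapper, offered only as a sketch: a hand-written, vectorised haversine function in base R, in which a missing coordinate propagates to NA through ordinary arithmetic. The function name haversine_m is hypothetical and not part of geosphere; the default radius of 6378137 m is assumed here to match the default used by distHaversine.

```r
# Hand-written haversine distance in meters, vectorised over its arguments.
# haversine_m is a made-up helper name, not a geosphere function.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1
  2 * r * asin(pmin(1, sqrt(a)))
}

lon <- c(58.2959, 58.7894, 59.1234, 68.1235)
lat <- c(70.6086, 72.1508, 71.1689, 69.4523)

# Shift the coordinates down by one row; the leading NA carries through
# the arithmetic, so the first distance comes out as NA.
prev_lon <- c(NA, head(lon, -1))
prev_lat <- c(NA, head(lat, -1))

distance <- haversine_m(prev_lon, prev_lat, lon, lat)
```

For the sample data, the values in distance should agree closely with the distHaversine results shown earlier in this answer.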

Data

mydata <- structure(list(Callsign = c("A118", "A118", "B222", "D123"), Altitude = c(18000L, 18500L, 18500L, 19000L), Speed = c(110L, 120L, 150L, 150L), Direction = c(340L, 339L, 350L, 110L), Date_Time = c("2017-11-06T22:28:09", "2017-11-06T22:29:09", "2017-11-08T07:28:09", "2018-05-29T15:13:27"), Latitude = c(70.6086, 72.1508, 71.1689, 69.4523), Longitude = c(58.2959, 58.7894, 59.1234, 68.1235)), .Names = c("Callsign", "Altitude", "Speed", "Direction", "Date_Time", "Latitude", "Longitude"), class = "data.frame", row.names = c(NA, -4L))

Answer 2

Using distm is also an option. However, it produces a distance matrix:

library(geosphere)
p <- cbind(df$Longitude, df$Latitude)
distm(head(p, -1), tail(p, -1), fun = distHaversine)
#          [,1]     [,2]     [,3]
# [1,] 172569.2  69279.8 394651.3
# [2,]      0.0 109928.5 454096.2
# [3,] 109928.5      0.0 387356.2

Then you can do

diag(distm(head(p, -1), tail(p, -1), fun = distHaversine))
# [1] 172569.2 109928.5 387356.2

You can also use the distHaversine function directly and avoid computing the whole matrix:

distHaversine(head(p, -1), tail(p, -1))
# [1] 172569.2 109928.5 387356.2
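The head(p, -1) / tail(p, -1) pairing used above lines each row up with the one that follows it. Here is a small base-R sketch of the same idea on a plain numeric vector (the values are made up), including the leading NA needed to make the result as long as the original data so that it can be stored as a column.

```r
x <- c(10, 13, 19, 20)

# head(x, -1) drops the last element, tail(x, -1) drops the first,
# so position i pairs x[i] with x[i + 1]
from <- head(x, -1)
to   <- tail(x, -1)
steps <- to - from

# Pad with NA in front so the result has one entry per original element
padded <- c(NA, steps)
padded
```

In the same way, wrapping the distHaversine call above in c(NA, ...) would give a vector with one entry per row of df.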

Answer 3

First we load the data and the library:

library(geosphere)
df <- data.frame(read.table(text ="
Callsign Altitude  Speed Direction   Date_Time             Latitude     Longitude
A118       18000    110  340         2017-11-06T22:28:09   70.6086      58.2959
A118       18500    120  339         2017-11-06T22:29:09   72.1508      58.7894
B222       18500    150  350         2017-11-08T07:28:09   71.1689      59.1234
D123       19000    150  110         2018-05-29T15:13:27   69.4523      68.1235"
, header=TRUE))

Next, we create a new column holding the distance between consecutive points, starting from row 2:

df$distance[2:nrow(df)] <- sapply(2:nrow(df), 
    function(x) distm(df[x-1,c('Longitude', 'Latitude')], df[x,c('Longitude', 'Latitude')], fun = distHaversine))

This yields a distance column whose first row is NA and whose remaining rows hold the distance (in meters) to the previous row.
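The assignment above relies on a base-R behaviour that may not be obvious: assigning into positions 2 to n of a column that does not exist yet creates the column and fills the skipped first position with NA. A minimal sketch with a made-up data frame and made-up values:

```r
toy <- data.frame(id = 1:4)

# The column "d" does not exist yet; assigning to rows 2..4 creates it
# and leaves row 1 as NA
toy$d[2:nrow(toy)] <- c(10, 20, 30)

toy$d
```

This is why no separate step is needed to set the first distance to NA in this approach.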

Answer 4

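You can use the sf package and compute the distances with st_distance(). To get the distance between one row and the next, you can do something like the following. (I assume that data is your data.frame.)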

library('sf')

# some points
poi <- data.frame(id = 1:4, 
                  long = c(103.864325, 103.874916, 103.989693, 103.789615), 
                  lat = c(1.281949, 1.305004, 1.359878, 1.404454),
                  name = c('Gardens by the Bay', 'National Stadium', 'Airport', 'Zoo'))

# coerce to sf object; crs = 4326 (WGS84 lon/lat) is set so that
# st_distance() returns meters instead of unitless degree differences
poi <- st_as_sf(x = poi, coords = c('long', 'lat'), crs = 4326)

# duplicate object with rows shifted down by one (row r holds the point of row r - 1)
poi2 <- poi[c(1, 1:(nrow(poi) - 1)), ]

# compute the distance between row r and row r - 1, element by element,
# which avoids building the full pairwise matrix
distance <- st_distance(x = poi, y = poi2, by_element = TRUE)

# add result to data as plain numbers (meters)
poi$distance <- as.numeric(distance)

# first distance to NA
poi$distance[1] <- NA

Calculating multiple distances between longitude/latitude points in an R data frame
