I am trying to find the total distance a worker has moved. My df looks like this:
Name x y
John 12 34
John 15 31
John 8 38
John 20 14
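I tried the dist(rbind()) function, but it gives the wrong result. It just returns the result of sqrt((row1)^2+(row2)^2+(row3)^2+(row4)^2), which I don't think is right.
So I tried doing this with a for loop, so that the distance between row 1 and row 2, row 2 and row 3, and so on is calculated separately and can then be summed. How can I do this?
My code currently looks like this: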
for(i in nrow(df)){
n <- dist(rbind(df$x,df$y))
}
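This just gives me the single incorrect result described above, rather than a list of the individual distances between each pair of consecutive rows (1-2, 2-3, and so on).
My expected output would be: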
4.2426
9.8995
26.8328
Then I could total them by running:
sum(n)
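Right?
Answer 0: (score: 1)
No loop is needed.
A dplyr / tidyverse approach that also handles multiple names (since the presence of a "Name" column suggests there are multiple workers).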
df <- data.frame( Name = c("John","John","John","John"),
x = c(12,15,8,20),
y = c(34,31,38,14),
stringsAsFactors = FALSE )
library(tidyverse)
df %>%
#group by name (just in case there are multiple workers in the DF)
#you can remove this line if there is only 1 worker
group_by( Name ) %>%
#get the previous x and y value
mutate( x_prev = lag( x ), y_prev = lag( y ) ) %>%
#filter out rows without previous x value
filter( !is.na( x_prev ) ) %>%
#calculate the distance
mutate( distance = sqrt( abs (x - x_prev )^2 + abs( y - y_prev )^2 ) ) %>%
#summarise to get the total distance
summarise( total_distance = sum( distance ) )
# # A tibble: 1 x 2
# Name total_distance
# <chr> <dbl>
# 1 John 41.0
#base R alternative: create a matrix of x and y, calculate the distance and create a matrix from the results
M <- as.matrix( dist( matrix( c( df$x, df$y ), ncol = 2 ) ) )
M
# 1 2 3 4
# 1 0.000000 4.242641 5.656854 21.54066
# 2 4.242641 0.000000 9.899495 17.72005
# 3 5.656854 9.899495 0.000000 26.83282
# 4 21.540659 17.720045 26.832816 0.00000
#get the first off diagonal of the matrix (row = column+1)
M[row(M) == col(M) + 1]
#[1] 4.242641 9.899495 26.832816
#sum the first off diagonal
sum( M[row(M) == col(M) + 1] )
#[1] 40.97495
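As a cross-check on the off-diagonal values above, here is a minimal base R sketch that skips the full distance matrix and uses diff() to take the step-to-step changes in x and y directly. It assumes the rows are already in the order of travel. The helper name step_distances is hypothetical, introduced only for illustration.

```r
# Same data as in the answer above
df <- data.frame(Name = rep("John", 4),
                 x = c(12, 15, 8, 20),
                 y = c(34, 31, 38, 14),
                 stringsAsFactors = FALSE)

# Hypothetical helper: Euclidean length of each step between consecutive points
step_distances <- function(x, y) {
  sqrt(diff(x)^2 + diff(y)^2)
}

steps <- step_distances(df$x, df$y)
steps       # three step lengths, one per pair of consecutive rows
sum(steps)  # total distance travelled
```

For n points this does O(n) work, whereas dist() computes all n*(n-1)/2 pairwise distances, so the difference may matter for long tracks.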
Answer 1: (score: 1)
Using base R, you can call dist on each pair of consecutive rows, then take the cumsum of the adjacent distances to get the results by name.
df <- read.table(text="Name x y
John 12 34
John 15 31
John 8 38
John 20 14
Mark 11 13
Mark 16 18", header=TRUE)
by(df, df$Name, function(mat) {
idx <- seq_len(nrow(mat))
cumsum(mapply(function(i,j) dist(mat[c(i,j), c("x","y")]),
head(idx, -1), tail(idx, -1)))
})
Alternatively, the following simply computes the whole distance matrix and extracts the first off-diagonal:
by(df, df$Name, function(mat) {
idx <- seq_len(nrow(mat))
cumsum(
as.matrix(dist(mat[,c("x","y")]))[cbind(head(idx, -1), tail(idx, -1))])
})
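Both versions above return the running (cumulative) distance for each name. If only one total per worker is wanted, a sketch along the same lines could sum the consecutive distances per group instead. It assumes the same df with John and Mark, swaps dist() for diff(), and uses a hypothetical helper named total_distance.

```r
df <- read.table(text = "Name x y
John 12 34
John 15 31
John 8 38
John 20 14
Mark 11 13
Mark 16 18", header = TRUE)

# Hypothetical helper: total path length for one worker's rows,
# assuming the rows are already in the order of travel
total_distance <- function(d) {
  sum(sqrt(diff(d$x)^2 + diff(d$y)^2))
}

# Split the rows by Name and apply the helper to each group;
# the result is a numeric vector named by worker
totals <- sapply(split(df, df$Name), total_distance)
totals
```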
Answer 2: (score: 0)
df<-data.frame("Name" = rep(x = "John",times = 4),"x" = c(12,15,8,20),"y" = c(34,31,38,14))
#> df
# Name x y
#1 John 12 34
#2 John 15 31
#3 John 8 38
#4 John 20 14
n<-numeric()
for(i in 1:(nrow(df) - 1)){
n[i] <- dist(rbind(df[i,-1],df[(i + 1),-1]))
}
print(n)
#[1] 4.242641 9.899495 26.832816
sum(n)
#[1] 40.97495
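For comparison with the loop above, the following sketch shows what the call in the question appears to compute. rbind(df$x, df$y) builds a two-row matrix whose rows are the whole x vector and the whole y vector, so dist() returns a single distance between those two vectors rather than distances between consecutive points. In addition, for(i in nrow(df)) runs only once, because nrow(df) is a single number. The variable names wrong, by_hand and iterations are introduced here purely for illustration.

```r
df <- data.frame(Name = rep("John", 4),
                 x = c(12, 15, 8, 20),
                 y = c(34, 31, 38, 14))

# Two rows (all x values, all y values), four columns
m <- rbind(df$x, df$y)
dim(m)

# dist() treats each ROW as a point, so this is one x-vs-y distance
wrong <- as.numeric(dist(m))
by_hand <- sqrt(sum((df$x - df$y)^2))
all.equal(wrong, by_hand)

# nrow(df) is a single value, so the loop body runs exactly once
iterations <- 0
for (i in nrow(df)) iterations <- iterations + 1
iterations
```

Iterating over 1:(nrow(df) - 1), as in the answer above, or seq_len(nrow(df) - 1), visits every consecutive pair instead.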