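I am trying to compute the Euclidean distance between consecutive 3D points and add that distance as an extra column. I tried looping over the rows like this: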
df1 <- as.data.frame(list('x'=1:5,'y'=(1:5)^2,'z'=6:10))
for (i in 2:nrow(df1)) {
df1$diff <- sqrt((df1$x[i,]-df1$x[i-1,])^2 -
(df1$y[i,]-df1$y[i-1,])^2 -
(df1$z[i,]-df1$z[i-1,])^2)
}
But I get this error:
Error in df1$x[i, ] : incorrect number of dimensions
Where am I going wrong?
Answer 0 (score: 2)
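The main problem is that you are treating x as an array-like object when indexing. That is, you are using x[row, col] indexing where x[element] should be used, because df1$x is a plain vector.
You also need to index into df1$diff when inserting the result; otherwise each iteration overwrites the whole column. Finally, your Euclidean distance equation is wrong: you need to add the squared differences, not subtract them.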
df1 <- data.frame(list(x = 1:5, y = (1:5)^2, z = 6:10))
df1$diff <- NA
for (i in 2:nrow(df1)) {
df1$diff[i] <- with(df1, sqrt((x[i] - x[i-1])^2 +
(y[i] - y[i-1])^2 +
(z[i] - z[i-1])^2))
}
> df1
x y z diff
1 1 1 6 NA
2 2 4 7 3.316625
3 3 9 8 5.196152
4 4 16 9 7.141428
5 5 25 10 9.110434
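To illustrate the indexing point, here is a minimal sketch (the names v and err are only for illustration). A column extracted with $ is a plain vector with no dim attribute, so it takes a single index; the two-index form belongs on the data frame itself.

```r
df1 <- data.frame(x = 1:5, y = (1:5)^2, z = 6:10)

v <- df1$x          # a plain integer vector
is.null(dim(v))     # TRUE: no dimensions, so v[i, ] cannot work

v[2]                # single index on the vector
df1[2, "x"]         # row/column index on the data frame itself

# Two indices on the vector reproduce the error from the question;
# tryCatch() captures the message instead of stopping the script
err <- tryCatch(v[2, ], error = function(e) conditionMessage(e))
err
```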
You don't need the loop, though. You can rely on R's element-wise (vectorized) operations and do the whole thing in one step:
df1 <- data.frame(list(x = 1:5, y = (1:5)^2, z = 6:10))
df1$diff <- c(NA, sqrt(rowSums((df1[-1, 1:3] - df1[-5, 1:3])^2)))
df1
> df1
x y z diff
1 1 1 6 NA
2 2 4 7 3.316625
3 3 9 8 5.196152
4 4 16 9 7.141428
5 5 25 10 9.110434
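If the real problem is large, you may want to coerce df1 to a matrix first, since operations on data frames are slower than on matrices.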
m1 <- as.matrix(df1[, 1:3])
m1 <- cbind(m1, diff = c(NA, sqrt(rowSums((m1[-1, 1:3] - m1[-5, 1:3])^2))))
> m1
x y z diff
[1,] 1 1 6 NA
[2,] 2 4 7 3.316625
[3,] 3 9 8 5.196152
[4,] 4 16 9 7.141428
[5,] 5 25 10 9.110434
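If you want to see whether the matrix version actually pays off for your data, a rough timing sketch along these lines may help. The data here is random and the names big_df, big_m, t_df and t_m are hypothetical; timings depend on your machine and R version, so no numbers are claimed.

```r
set.seed(1)
n <- 1e5
big_df <- data.frame(x = rnorm(n), y = rnorm(n), z = rnorm(n))
big_m  <- as.matrix(big_df)

# Same computation on the data frame ...
t_df <- system.time(
  d_df <- c(NA, sqrt(rowSums((big_df[-1, ] - big_df[-n, ])^2)))
)
# ... and on the matrix
t_m <- system.time(
  d_m <- c(NA, sqrt(rowSums((big_m[-1, ] - big_m[-n, ])^2)))
)

t_df["elapsed"]
t_m["elapsed"]

# Both versions should agree on the values (names are dropped before comparing)
all.equal(unname(d_df), unname(d_m))
```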
You can wrap this into a function using head() and tail(), so you don't have to worry about how many rows the original data has:
myEuc <- function(x) {
if (isdf <- is.data.frame(x)) {
x <- data.matrix(x)
}
dij <- c(NA, sqrt(rowSums((tail(x, -1) - head(x, -1))^2)))
x <- cbind(x, diff = dij)
if (isdf) {
x <- as.data.frame(x)
}
x
}
df1 <- data.frame(list(x = 1:5, y = (1:5)^2, z = 6:10))
myEuc(df1)
> myEuc(df1)
x y z diff
1 1 6 NA
[2,] 2 4 7 3.316625
[3,] 3 9 8 5.196152
[4,] 4 16 9 7.141428
[5,] 5 25 10 9.110434
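One way to gain confidence in the head()/tail() approach is to cross-check it against stats::dist() on data with a different number of rows. The sketch below uses a hypothetical helper, consec_dist, that mirrors the core line of the function above; it is a check, not a replacement.

```r
# Hypothetical helper: distance from each row to the previous one
consec_dist <- function(m) {
  m <- as.matrix(m)
  c(NA, sqrt(rowSums((tail(m, -1) - head(m, -1))^2)))
}

pts <- cbind(x = 1:7, y = (1:7)^2, z = 6:12)   # 7 rows instead of 5
d <- consec_dist(pts)

# dist() computes all pairwise Euclidean distances; the entries just
# below the diagonal are the distances between consecutive rows
full <- as.matrix(dist(pts))
n <- nrow(pts)
ref <- full[cbind(2:n, 1:(n - 1))]

all.equal(unname(d[-1]), ref)
```

Note that dist() builds the full pairwise structure, so it is only sensible as a check on small inputs.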
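Answer 1 (score: 1)
Here is another option. Note that lag() with a default argument is dplyr::lag(), so dplyr needs to be loaded first:
library(dplyr)
sqrt(Reduce('+', lapply(df1, function(x) (x - lag(x, default = x[1]))^2)))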
#[1] 0.000000 3.316625 5.196152 7.141428 9.110434
Or, using only base R:
c(0,sqrt(rowSums((sapply(df1, diff))^2)))
#[1] 0.000000 3.316625 5.196152 7.141428 9.110434
Answer 2 (score: 0)
Your new variable has fewer observations than the data.frame has rows, so you need to add an NA at the top or bottom of the vector:
df1 <- as.data.frame(list('x'=1:5,'y'=(1:5)^2,'z'=6:10))
myVec <- numeric(nrow(df1))
myVec[1] <- NA
for (i in 2:nrow(df1)) {
myVec[i] <- sqrt((df1[i,"x"]-df1[i-1,"x"])^2 +
(df1[i,"y"]-df1[i-1,"y"])^2 +
(df1[i,"z"]-df1[i-1,"z"])^2)
}
df1$diff <- myVec
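The "top or bottom" choice can be sketched without a loop as well. The column names dist_prev and dist_next and the variable seg are made up for this illustration; which end gets the NA depends on whether each row should hold the distance from the previous point or to the next one.

```r
df1 <- data.frame(x = 1:5, y = (1:5)^2, z = 6:10)

# Distances between consecutive rows: one fewer value than there are rows
seg <- sqrt(rowSums(sapply(df1, diff)^2))
length(seg)                     # nrow(df1) - 1

df1$dist_prev <- c(NA, seg)     # NA on top: distance from the previous row
df1$dist_next <- c(seg, NA)     # NA at the bottom: distance to the next row
df1
```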