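I am trying to apply the dist() function to each group of rows in R, but the result I get looks as if no grouping happened at all: dist() is simply applied to my whole data frame.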
df2 %>% dplyr::group_by(X1) %>% dist()
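df2 is my data frame; for now I am only working with its head to keep things simple. Basically, each group contains coordinates (A, B), and I am trying to get the distance between every pair of points within a group.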
Here is my data frame:
X1 A B
1 1 12 0.0
2 1 18 0.0
3 1 18 1.0
4 1 13 0.0
5 1 18 4.0
6 1 18 0.0
7 1 18 5.0
8 1 18 0.0
9 1 18 0.0
10 2 73 -2.0
11 2 73 -0.5
12 2 74 -0.5
13 2 73 0.0
14 2 71 -1.0
15 2 75 0.0
Answer 0 (score: 2)
Here's an example of creating distance matrices of the iris data set by species:
results = list()
for(spec in unique(iris$Species)){
temp = iris[iris$Species==spec, 1:4]
results[[length(results)+1]] = dist(temp)
}
names(results) = unique(iris$Species)
You'll have to figure out what to do with the results afterwards.
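The same loop pattern could be adapted to the data from the question. This is only a sketch: it assumes the columns X1, A and B from the question, uses a shortened copy of the data, and the names g, temp and m1 are illustrative. It also shows one possible next step, turning a dist object into a full matrix so single distances can be looked up by row label.

```r
# Assumption: shortened copy of the question's data (columns X1, A, B)
df2 <- data.frame(
  X1 = c(1, 1, 1, 2, 2),
  A  = c(12, 18, 18, 73, 73),
  B  = c(0, 0, 1, -2, -0.5)
)

results <- list()
for (g in unique(df2$X1)) {
  # Keep only the coordinate columns of the current group
  temp <- df2[df2$X1 == g, c("A", "B")]
  # Store the distance object under the group's name
  results[[as.character(g)]] <- dist(temp)
}

# as.matrix() expands a dist object into a full symmetric matrix,
# labelled with the original row names
m1 <- as.matrix(results[["1"]])
m1["1", "2"]  # distance between rows 1 and 2 of group 1
```

Naming the list elements inside the loop avoids the separate names() call, at the cost of converting the group value to a character string.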
Answer 1 (score: 1)
We can use purrr::map:
library(purrr)
df %>%
split(.$X1) %>%
map(~{
dist(.x)
}) -> distList
distList
#> $`1`
#> 1 2 3 4 5 6 7 8
#> 2 6.000000
#> 3 6.082763 1.000000
#> 4 1.000000 5.000000 5.099020
#> 5 7.211103 4.000000 3.000000 6.403124
#> 6 6.000000 0.000000 1.000000 5.000000 4.000000
#> 7 7.810250 5.000000 4.000000 7.071068 1.000000 5.000000
#> 8 6.000000 0.000000 1.000000 5.000000 4.000000 0.000000 5.000000
#> 9 6.000000 0.000000 1.000000 5.000000 4.000000 0.000000 5.000000 0.000000
#>
#> $`2`
#> 10 11 12 13 14
#> 11 1.500000
#> 12 1.802776 1.000000
#> 13 2.000000 0.500000 1.118034
#> 14 2.236068 2.061553 3.041381 2.236068
#> 15 2.828427 2.061553 1.118034 2.000000 4.123106
Data used above:
df <- read.table(text = 'X1 A B
1 1 12 0.0
2 1 18 0.0
3 1 18 1.0
4 1 13 0.0
5 1 18 4.0
6 1 18 0.0
7 1 18 5.0
8 1 18 0.0
9 1 18 0.0
10 2 73 -2.0
11 2 73 -0.5
12 2 74 -0.5
13 2 73 0.0
14 2 71 -1.0
15 2 75 0.0', h = T)
Answer 2 (score: 1)
Here is my code and solution:
library(dplyr)
df2 <- structure(list(X1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L), A = c(12L, 18L, 18L, 13L, 18L, 18L, 18L,
18L, 18L, 73L, 73L, 74L, 73L, 71L, 75L), B = c(0, 0, 1, 0, 4,
0, 5, 0, 0, -2, -0.5, -0.5, 0, -1, 0)), .Names = c("X1", "A",
"B"), class = "data.frame", row.names = c("1", "2", "3", "4",
"5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15"))
mydf <- df2 %>% group_by(X1) %>% summarise(distmatrix=list(dist(cbind(A,B))))
mydf
# # A tibble: 2 × 2
# X1 distmatrix
# <int> <list>
# 1 1 <S3: dist>
# 2 2 <S3: dist>
mydf$distmatrix
# [[1]]
# 1 2 3 4 5 6 7 8
# 2 6.000000
# 3 6.082763 1.000000
# 4 1.000000 5.000000 5.099020
# 5 7.211103 4.000000 3.000000 6.403124
# 6 6.000000 0.000000 1.000000 5.000000 4.000000
# 7 7.810250 5.000000 4.000000 7.071068 1.000000 5.000000
# 8 6.000000 0.000000 1.000000 5.000000 4.000000 0.000000 5.000000
# 9 6.000000 0.000000 1.000000 5.000000 4.000000 0.000000 5.000000 0.000000
#
# [[2]]
# 1 2 3 4 5
# 2 1.500000
# 3 1.802776 1.000000
# 4 2.000000 0.500000 1.118034
# 5 2.236068 2.061553 3.041381 2.236068
# 6 2.828427 2.061553 1.118034 2.000000 4.123106
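One possible way to work with an element of the list column afterwards is to flatten it into a table with one row per pair of points. The sketch below uses base R only; the small dist object d stands in for a single element such as mydf$distmatrix[[2]], and the names m, idx and pairs are illustrative.

```r
# Assumption: first three points of group 2 from the question
d <- dist(cbind(A = c(73, 73, 74), B = c(-2, -0.5, -0.5)))

# Expand to a full symmetric matrix
m <- as.matrix(d)

# Row/column positions of the lower triangle, i.e. each pair once
idx <- which(lower.tri(m), arr.ind = TRUE)

# One row per pair: point i, point j and their distance
pairs <- data.frame(
  i        = as.vector(idx[, "row"]),
  j        = as.vector(idx[, "col"]),
  distance = m[idx]
)
pairs
```

The i and j columns are positions within the group, not the original row numbers of the data frame, which matches how the second matrix above is labelled 1 to 6.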