List of lists by unique coordinates

Time: 2017-12-24 06:58:25

Tags: r list

I have a data frame like the one below. I want to collapse it so that each unique coordinate becomes a list of its subIDs.

       subID                  latlon
1  S20298920 29.2178694, -94.9342990
2  S35629295 26.7063982, -80.7168961
3  S35844314 26.7063982, -80.7168961
4  S35833936 26.6836236, -80.3512144
7  S30634757 42.4585456, -76.5146989
8  S35834082 26.4330582, -80.9416786
9  S35857972 26.4330582, -80.9416786
10 S35833885 26.7063982, -80.7168961

So here I'd like (26.7063982, -80.7168961) to be a list containing (S35629295, S35844314, S35833885), and (29.2178694, -94.9342990) to be a list containing just (S20298920). I think a list of lists makes the most sense.

3 Answers:

Answer 0 (score: 1)

Use aggregate:

out <- aggregate(data=df,subID~latlon,FUN = function(t) list(sort(paste(t))))

Because your dataset is large and cumbersome, the example code below uses a simplified dataset that is easier to read.

out <- aggregate(data=df,name~ID,FUN = function(t) list(sort(paste(t))))
out
  ID          name
1  1 apple, orange
2  2        orange
3  3 apple, orange

Data:

df <- data.frame(ID=c(1,1,2,3,3),
                 name=c('apple', 'orange', 'orange', 'orange', 'apple'))
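
As a hedged sketch (not part of the original answer), the same aggregate call can be applied to the question's data, rebuilt here by hand. The names `coords` and `by_coord` are introduced only for illustration, and `setNames` is used to turn the resulting list column into a named list keyed by coordinate.

```r
# Rebuild the question's data; latlon stays a single string, as in the question.
coords <- data.frame(
  subID  = c("S20298920", "S35629295", "S35844314", "S35833936",
             "S30634757", "S35834082", "S35857972", "S35833885"),
  latlon = c("29.2178694, -94.9342990", "26.7063982, -80.7168961",
             "26.7063982, -80.7168961", "26.6836236, -80.3512144",
             "42.4585456, -76.5146989", "26.4330582, -80.9416786",
             "26.4330582, -80.9416786", "26.7063982, -80.7168961"),
  stringsAsFactors = FALSE
)

# Wrapping each group's result in list() gives one list element per coordinate.
out <- aggregate(subID ~ latlon, data = coords,
                 FUN = function(t) list(sort(paste(t))))

# Hypothetical follow-up step: a named list keyed by coordinate.
by_coord <- setNames(out$subID, out$latlon)
by_coord[["26.7063982, -80.7168961"]]
```

Sorting inside FUN is optional; it only makes the order of subIDs within each coordinate predictable.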


Answer 1 (score: 0)

   with(data,tapply(subID,latlon,as.list))

Output:

$`26.4330582 -80.9416786`
$`26.4330582 -80.9416786`[[1]]
[1] "S35834082"

$`26.4330582 -80.9416786`[[2]]
[1] "S35857972"


$`26.6836236 -80.3512144`
$`26.6836236 -80.3512144`[[1]]
[1] "S35833936"
   :
   :
   :

Data:

 data=read.table(text="subID latlon
 S20298920 '29.2178694 -94.9342990'
 S35629295 '26.7063982 -80.7168961'
 S35844314 '26.7063982 -80.7168961'
 S35833936 '26.6836236 -80.3512144'
 S30634757 '42.4585456 -76.5146989'
 S35834082 '26.4330582 -80.9416786'
 S35857972 '26.4330582 -80.9416786'
 S35833885 '26.7063982 -80.7168961' ",h=T,stringsAsFactors=F)
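
Note that tapply with as.list nests each subID in its own one-element list. If a plain character vector per coordinate is enough, base R's split may be a simpler alternative. This is a sketch that is not from the original answer; `ids_by_coord` is a name introduced for illustration.

```r
# Same data as above, rebuilt so the snippet runs on its own.
data <- read.table(text = "subID latlon
S20298920 '29.2178694 -94.9342990'
S35629295 '26.7063982 -80.7168961'
S35844314 '26.7063982 -80.7168961'
S35833936 '26.6836236 -80.3512144'
S30634757 '42.4585456 -76.5146989'
S35834082 '26.4330582 -80.9416786'
S35857972 '26.4330582 -80.9416786'
S35833885 '26.7063982 -80.7168961'", header = TRUE, stringsAsFactors = FALSE)

# split() returns a named list with one character vector of subIDs per coordinate.
ids_by_coord <- split(data$subID, data$latlon)
ids_by_coord[["26.7063982 -80.7168961"]]
```

Within each coordinate the subIDs keep the order in which they appear in the data frame.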

Answer 2 (score: 0)

In the tidyverse, you can use tidyr::nest to nest the data frame:

library(tidyverse)

# tibble() replaces data_frame(), which is deprecated in current versions of tibble
df <- tibble(subID = c("S20298920", "S35629295", "S35844314", "S35833936", "S30634757", "S35834082", "S35857972", "S35833885"), 
                 latlon = c("29.2178694, -94.934299", "26.7063982, -80.7168961", "26.7063982, -80.7168961", "26.6836236, -80.3512144", "42.4585456, -76.5146989", "26.4330582, -80.9416786", "26.4330582, -80.9416786", "26.7063982, -80.7168961"))

# tidyr >= 1.0.0 syntax; older versions used nest(subID)
df %>% nest(data = c(subID))
#> # A tibble: 5 x 2
#>                    latlon             data
#>                     <chr>           <list>
#> 1  29.2178694, -94.934299 <tibble [1 x 1]>
#> 2 26.7063982, -80.7168961 <tibble [3 x 1]>
#> 3 26.6836236, -80.3512144 <tibble [1 x 1]>
#> 4 42.4585456, -76.5146989 <tibble [1 x 1]>
#> 5 26.4330582, -80.9416786 <tibble [2 x 1]>

Or just summarise with list to make a list column of vectors:

df %>% 
    group_by(latlon) %>% 
    summarise_all(list)
#> # A tibble: 5 x 2
#>                    latlon     subID
#>                     <chr>    <list>
#> 1 26.4330582, -80.9416786 <chr [2]>
#> 2 26.6836236, -80.3512144 <chr [1]>
#> 3 26.7063982, -80.7168961 <chr [3]>
#> 4  29.2178694, -94.934299 <chr [1]>
#> 5 42.4585456, -76.5146989 <chr [1]>
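
If a plain named list is wanted rather than a list column, one possible follow-up step is to pull the column out and name it by coordinate. This is a minimal sketch, not part of the original answer; it assumes dplyr is installed, and `by_coord` and `coord_list` are names introduced for illustration.

```r
library(dplyr)

df <- tibble(
  subID  = c("S20298920", "S35629295", "S35844314", "S35833936",
             "S30634757", "S35834082", "S35857972", "S35833885"),
  latlon = c("29.2178694, -94.934299", "26.7063982, -80.7168961",
             "26.7063982, -80.7168961", "26.6836236, -80.3512144",
             "42.4585456, -76.5146989", "26.4330582, -80.9416786",
             "26.4330582, -80.9416786", "26.7063982, -80.7168961")
)

# One row per coordinate, with the subIDs collected into a list column.
by_coord <- df %>%
  group_by(latlon) %>%
  summarise(subID = list(subID))

# Convert the list column into a named list keyed by coordinate.
coord_list <- setNames(by_coord$subID, by_coord$latlon)
coord_list[["26.7063982, -80.7168961"]]
```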