Creating an edge list in R

Time: 2015-02-08 11:26:45

Tags: r edge-list

I have data like this:

ID=c(rep("ID1",3), rep("ID2",2), "ID3", rep("ID4",2))
item=c("a","b","c","a","c","a","b","a")

data.frame(ID,item)

ID1 a
ID1 b
ID1 c
ID2 a
ID2 c
ID3 a
ID4 b
ID4 a

I need it as an edge list like this:

a;b
b;c
a;c
a;c
b;a

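The first three edges come from ID1 and the fourth from ID2; ID3 has only one item, so it yields no edge; the fifth edge comes from ID4. Any ideas on how to achieve this? melt/cast?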

3 answers:

Answer 0: (score: 6)

I would guess there is a simple igraph solution, but here is a simple one using data.table:

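library(data.table)
# Assumes df is the data frame from the question: df <- data.frame(ID, item)
# Per ID, paste every pair of items; IDs with a single item are skipped
setDT(df)[, if (.N > 1) combn(as.character(item), 2, paste, collapse = ";"), ID]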

#     ID  V1
# 1: ID1 a;b
# 2: ID1 a;c
# 3: ID1 b;c
# 4: ID2 a;c
# 5: ID4 b;a

Answer 1: (score: 3)

Try

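# Assumes df is the data frame from the question: df <- data.frame(ID, item)
# For each ID with at least two items, build a two-column matrix of item pairs
res <- do.call(rbind, with(df, tapply(item, ID,
         FUN = function(x) if (length(x) >= 2) t(combn(x, 2)))))
paste(res[, 1], res[, 2], sep = ";")
# [1] "a;b" "a;c" "b;c" "a;c" "b;a"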

Answer 2: (score: 2)

Here is a more scalable solution that uses the same core logic as the other answers:

library(plyr)
library(dplyr)

ID=c(rep("ID1",3), rep("ID2",2), "ID3", rep("ID4",2))
item=c("a","b","c","a","c","a","b","a")

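dfPaths = data.frame(ID, item)
# Keep only IDs with more than one item; combn() needs at least two
dfPaths2 = dfPaths %>% 
  group_by(ID) %>% 
  mutate(numitems = n(), item = as.character(item)) %>%
  filter(numitems > 1)

# One row per item pair within each ID
ddply(dfPaths2, .(ID), function(x) t(combn(x$item, 2)))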
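
For reference, the core logic the three answers share (all pairwise combinations of items within each ID, skipping IDs with fewer than two items) can also be sketched in base R without any packages. This is only an illustrative sketch: the helper name edge_list and its sep argument are made up for this example and do not come from any of the answers.

```r
# Hypothetical helper (not from the answers above): build "x;y" edges
# from every pair of items that share an ID.
edge_list <- function(id, item, sep = ";") {
  # Split items by ID; as.character() guards against factor input.
  groups <- split(as.character(item), id)
  # combn(x, 2) fails with "n < m" when x has a single element,
  # so drop groups with fewer than two items first.
  groups <- groups[lengths(groups) >= 2]
  # For each remaining group, paste every pair of items together.
  edges <- lapply(groups, function(x) combn(x, 2, FUN = paste, collapse = sep))
  unlist(edges, use.names = FALSE)
}

# Data from the question.
ID   <- c(rep("ID1", 3), rep("ID2", 2), "ID3", rep("ID4", 2))
item <- c("a", "b", "c", "a", "c", "a", "b", "a")

edges <- edge_list(ID, item)
print(edges)
# [1] "a;b" "a;c" "b;c" "a;c" "b;a"
```

As in the answers, the pairs within ID1 come out in combn order (a;b, a;c, b;c) rather than the order listed in the question; the set of edges is the same.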