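Removing mirrored rows from a data frame

Time: 2015-02-16 16:17:25

Tags: r dataframe igraph

I'm a beginner and this question may seem naive, but I'm trying to build a network from the family relationships in a population. I'm using the R package igraph.

After preparing my data, I ended up with a data frame like this: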

Source    Target    Distance
Actr22510 Actr22509        1
Actr22511 Actr22509        1
Actr22509 Actr22510        1
Actr22511 Actr22510        1
Actr57033 Actr22510        1
Actr22509 Actr22511        1

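The network I'm trying to build is undirected. In that case, the rows Actr22510-Actr22509 and Actr22509-Actr22510 describe the same edge, and I don't need both of them in my data frame.

Is it possible to remove these mirrored rows?

Thanks a lot.

2 answers:

Answer 0: (score: 3)

If the end goal is to create an undirected igraph object, you may not need to remove these rows at all. Simply: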

library(igraph)

# Create an undirected graph, with edges between "Source" and "Target".
# Distance is kept as an edge attribute.
# graph_from_data_frame() is the current name of graph.data.frame().
g <- graph_from_data_frame(df, directed=FALSE)

# Remove multiple edges (originally created from "mirror" lines)
g <- simplify(g, remove.multiple=TRUE, remove.loops=FALSE, edge.attr.comb="first")
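
If a deduplicated edge list is still wanted afterwards, the simplified graph can be converted back into a data frame. The following is a minimal, self-contained sketch of that round trip, assuming a recent igraph version in which `graph_from_data_frame()` and `as_data_frame()` are available. The variable name `edges` is only illustrative.

```r
library(igraph)

# Sample data from the question
df <- data.frame(
  Source   = c("Actr22510", "Actr22511", "Actr22509",
               "Actr22511", "Actr57033", "Actr22509"),
  Target   = c("Actr22509", "Actr22509", "Actr22510",
               "Actr22510", "Actr22510", "Actr22511"),
  Distance = c(1L, 1L, 1L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Build the undirected graph; mirrored rows become parallel edges
g <- graph_from_data_frame(df, directed = FALSE)

# Collapse parallel edges, keeping the first Distance value of each group
g <- simplify(g, remove.multiple = TRUE, remove.loops = FALSE,
              edge.attr.comb = "first")

# Convert back to an edge list: one row per undirected pair
edges <- igraph::as_data_frame(g, what = "edges")
nrow(edges)
```

The returned columns are named `from` and `to` rather than `Source` and `Target`, and the orientation of each pair may differ from the original rows.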

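Answer 1: (score: 2)

One option is to sort the first two columns within each row, paste them together into a key, and then check whether those keys are duplicated: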

# Reproduce the sample data from the question
df <- structure(list(Source = c("Actr22510", "Actr22511", "Actr22509", "Actr22511", "Actr57033", "Actr22509"),
                     Target = c("Actr22509", "Actr22509", "Actr22510", "Actr22510", "Actr22510", "Actr22511"),
                     Distance = c(1L, 1L, 1L, 1L, 1L, 1L)),
                .Names = c("Source", "Target", "Distance"), class = "data.frame", row.names = c(NA, -6L))

# Build an order-independent key per row: sort the two IDs, then paste them
df$key <- apply(df[, 1:2], 1, FUN = function(x) paste(sort(x), collapse = " "))

# Keep only the first row for each key
df[!duplicated(df$key), ]
#     Source    Target Distance                 key
#1 Actr22510 Actr22509        1 Actr22509 Actr22510
#2 Actr22511 Actr22509        1 Actr22509 Actr22511
#4 Actr22511 Actr22510        1 Actr22510 Actr22511
#5 Actr57033 Actr22510        1 Actr22510 Actr57033

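Since you would rather not use the apply function, this may be easier to understand:

# Put the smaller ID first so both orientations of a pair give the same key
# (Source and Target must be character vectors, not factors)
df$key <- ifelse(df$Source < df$Target, paste(df$Source, df$Target), paste(df$Target, df$Source))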

df[!duplicated(df$key),]
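
The same idea can also be written without adding a `key` column, using base R's `pmin()` and `pmax()`, which work element-wise on character vectors. This is a minimal sketch rather than part of the original answers, and the helper name `drop_mirrored` is hypothetical. It assumes `Source` and `Target` are character columns, not factors.

```r
# Sample data from the question
df <- data.frame(
  Source   = c("Actr22510", "Actr22511", "Actr22509",
               "Actr22511", "Actr57033", "Actr22509"),
  Target   = c("Actr22509", "Actr22509", "Actr22510",
               "Actr22510", "Actr22510", "Actr22511"),
  Distance = c(1L, 1L, 1L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the first row of every unordered (a, b) pair
drop_mirrored <- function(d, a = "Source", b = "Target") {
  lo <- pmin(d[[a]], d[[b]])   # smaller ID of each row
  hi <- pmax(d[[a]], d[[b]])   # larger ID of each row
  d[!duplicated(data.frame(lo, hi)), , drop = FALSE]
}

result <- drop_mirrored(df)
result
# 4 rows remain (original rows 1, 2, 4 and 5)
```

Because the key is never stored in the data frame, the result keeps only the original three columns.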