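I am trying to use R to link multiple pairs of unique IDs together. In the example below, two ID columns (here ID1 and ID2) indicate a link, and I want to create groups of linked rows. In this example, A links to B, which links to D, which links to E. Because these are all connected, I want to group them together. Next, X links to both Y and Z. Because these are also connected, I want to assign them to a single group as well. How can I solve this with R?
Thanks!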
Example data:
ID1 ID2
A   B
B   D
D   E
X   Y
X   Z
dput() representation in R:
structure(list(id1 = structure(c(1L, 2L, 3L, 4L, 4L), .Label = c("A", "B", "D", "X"), class = "factor"), id2 = structure(1:5, .Label = c("B", "D", "E", "Y", "Z"), class = "factor")), .Names = c("id1", "id2"), row.names = c(NA, -5L), class = "data.frame")
Desired output: the same rows with a group assigned to each, so that the three rows linking A, B, D and E share one group and the two rows linking X, Y and Z share another.
Answer 0 (score: 10)
As @Frank mentioned in the comments, you can use igraph:
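library(igraph)
# Build a graph from the edge list; graph_from_data_frame() is the current name for graph.data.frame()
idf <- graph_from_data_frame(df)
# components() is the current name for clusters(); its default mode = "weak" ignores edge direction
components(idf)$membership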
which gives:
A B D X E Y Z
1 1 1 2 1 2 2
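To make the meaning of that membership vector concrete, here is a minimal base-R sketch that computes the same grouping with a simple union-find loop. It is an illustration only, not how igraph works internally; the names `parent`, `find_root`, `roots` and `membership` are hypothetical helpers introduced for this sketch, and the data frame is rebuilt with plain character columns instead of the factors in the question.

```r
# Edge list from the question, as character columns
df <- data.frame(id1 = c("A", "B", "D", "X", "X"),
                 id2 = c("B", "D", "E", "Y", "Z"),
                 stringsAsFactors = FALSE)

# Every ID starts as its own parent, i.e. in its own group
ids <- unique(c(df$id1, df$id2))
parent <- setNames(ids, ids)

# Follow parent links until reaching a root (an ID that is its own parent)
find_root <- function(x) {
  while (parent[[x]] != x) x <- parent[[x]]
  x
}

# For each edge, merge the groups of its two endpoints
for (i in seq_len(nrow(df))) {
  a <- find_root(df$id1[i])
  b <- find_root(df$id2[i])
  if (a != b) parent[[b]] <- a
}

# Label each ID by its root, numbering groups in order of first appearance
roots <- vapply(ids, find_root, character(1))
membership <- setNames(match(roots, unique(roots)), ids)
membership
```

For a handful of IDs this loop is fine; for large edge lists the igraph call is the more practical choice.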
If you want to assign the result back to the rows of df:
# stack() turns the named membership vector into a data frame with columns "values" and "ind"
merge(df, stack(components(idf)$membership), by.x = "id1", by.y = "ind", all.x = TRUE)
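Note that merge() sorts its result by the key column by default, so the rows may come back in a different order. If the original row order matters, a name-based lookup is one possible alternative. The sketch below hard-codes the membership vector from the output shown earlier so that it runs without igraph; the column name `group` is an assumption chosen for illustration. Looking up id1 alone is enough, because both endpoints of a row always belong to the same component.

```r
# Edge list from the question, as character columns
df <- data.frame(id1 = c("A", "B", "D", "X", "X"),
                 id2 = c("B", "D", "E", "Y", "Z"),
                 stringsAsFactors = FALSE)

# Membership vector copied from the output shown earlier; in practice it
# would come from the igraph call rather than being typed by hand
membership <- c(A = 1, B = 1, D = 1, X = 2, E = 1, Y = 2, Z = 2)

# Look up each row's group by the name of its id1 value.
# "group" is a hypothetical column name chosen for this sketch.
df$group <- unname(membership[as.character(df$id1)])
df
```

The as.character() call is a no-op here but matters when id1 is a factor, as in the dput() data, since indexing by a factor would use its integer codes rather than its labels.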