Question

I'm working with network data and ran into strange (or at least unexpected) behavior from count.multiple in R's igraph package.

library(igraph)
library(plyr)

df <- data.frame( sender = c( "a", "a", "a", "b", "b", "c","c","d" ),
              receiver = c( "b", "b", "b", "c", "a", "d", "d", "a" ) )

What I want is to count every edge and use its multiplicity as the weight.

When I run ddply(df, .(sender, receiver), "nrow"), I get:

  sender receiver nrow
1      a        b    3
2      b        a    1
3      b        c    1
4      c        d    2
5      d        a    1

This is what I expected.

However, I can't reproduce this with igraph's count.multiple, which I expected to behave the same way:

df.graph <- graph.edgelist(as.matrix(df))
E(df.graph)$weight <- count.multiple(df.graph)

E(df.graph)$weight gives:

3 3 3 1 1 2 2 1

I then used the simplify command:

df.graph <- simplify(df.graph)

after which E(df.graph)$weight gives:

9 1 1 4 1

I see what is happening here: simplify just sums the weights. But I don't understand why or when one would use this rather than what ddply does.

Any ideas?

Thanks!

Answer 1

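By default, simplify sums the weights of the multiple edges it merges. count.multiple has already given each of the three a->b edges a weight of 3, so merging them yields 3 + 3 + 3 = 9.

To avoid this double counting, you can set the initial weights to 1: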
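
The arithmetic behind those numbers can be reproduced in base R without igraph. This is only an illustrative sketch: the names key, mult, summed, unit and maxed are made up here, and ave/tapply stand in for what count.multiple and simplify do to the weights, not for how igraph implements them.

```r
# Edge list from the question.
df <- data.frame(sender   = c("a", "a", "a", "b", "b", "c", "c", "d"),
                 receiver = c("b", "b", "b", "c", "a", "d", "d", "a"))

# One key per directed edge, e.g. "a b" (hypothetical helper variable).
key <- paste(df$sender, df$receiver)

# Per-row multiplicity, mimicking the weights count.multiple reported.
mult <- ave(rep(1, nrow(df)), key, FUN = sum)
mult                                         # 3 3 3 1 1 2 2 1

# Summing those multiplicities per unique edge counts each copy again.
summed <- tapply(mult, key, sum)             # 9 1 1 4 1

# Summing unit weights, or taking the max of the multiplicities,
# recovers the plain counts that ddply returned.
unit  <- tapply(rep(1, nrow(df)), key, sum)  # 3 1 1 2 1
maxed <- tapply(mult, key, max)              # 3 1 1 2 1
```

The last two lines correspond to the two fixes: either start from weights of 1 and let them be summed, or keep the multiplicities and combine them with something other than a sum.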

g <- graph.edgelist(as.matrix(df))
E(g)$weight <- 1
g <- simplify( g )
E(g)$weight

Or change how the weights are combined:

g <- graph.edgelist(as.matrix(df))
E(g)$weight <- count.multiple(g)
g <- simplify( g, edge.attr.comb = list(weight=max, name="concat", "ignore") )
E(g)$weight
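
For comparison, here is a sketch of both approaches side by side, checked against the counts ddply gave in the question. It assumes a reasonably recent igraph, where the underscore-style names graph_from_edgelist and count_multiple are available alongside the dotted ones used above; g1 and g2 are names chosen for this example.

```r
library(igraph)

# Edge list from the question.
df <- data.frame(sender   = c("a", "a", "a", "b", "b", "c", "c", "d"),
                 receiver = c("b", "b", "b", "c", "a", "d", "d", "a"))

# Approach 1: unit weights, merged with simplify's default (sum).
g1 <- graph_from_edgelist(as.matrix(df))
E(g1)$weight <- 1
g1 <- simplify(g1)

# Approach 2: multiplicities as weights, merged with max instead of sum.
g2 <- graph_from_edgelist(as.matrix(df))
E(g2)$weight <- count_multiple(g2)
g2 <- simplify(g2, edge.attr.comb = list(weight = "max", "ignore"))

E(g1)$weight   # 3 1 1 2 1
E(g2)$weight   # 3 1 1 2 1

# Edges with their weights as a data frame, comparable to the ddply output.
igraph::as_data_frame(g1, what = "edges")
```

Both graphs end up with the same five edges and the weights 3, 1, 1, 2, 1, in the same order as the ddply table.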
