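First of all, thanks for reading and possibly responding.
Now, the question: I'm on Python 2.7, and when I try to find communities in my graph using the fastgreedy algorithm, I get this error: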
---------------------------------------------------------------------------
InternalError Traceback (most recent call last)
<ipython-input-180-3b8456851658> in <module>()
----> 1 dendrogram = g_summary.community_fastgreedy(weights=edge_frequency.values())
/usr/local/lib/python2.7/site-packages/igraph/__init__.pyc in community_fastgreedy(self, weights)
959 in very large networks. Phys Rev E 70, 066111 (2004).
960 """
--> 961 merges, qs = GraphBase.community_fastgreedy(self, weights)
962
963 # qs may be shorter than |V|-1 if we are left with a few separated
InternalError: Error at fast_community.c:553: fast-greedy community finding works only on graphs without multiple edges, Invalid value
This is how I create the graph:
import igraph as ig
import numpy as np
from itertools import combinations

vertices = words  # about 600 words from a number of news articles: ['palestine', 'israel', 'hamas', 'nasa', 'mercury', 'water', ...]
gen = ig.UniqueIdGenerator()
[gen[word] for word in vertices]  # generate word-to-integer mapping as each edge has to be between integer ids (words)
edges = []
for ind in xrange(articles.shape[0]):  # articles is a pandas dataframe; each row corresponds to an article; one column is 'top_words' which includes the top few words of each article. The above list *words* is the unique union set of top_words for all articles.
    words_i = articles['top_words'].values[ind]  # for one article, this looks like ['palestine','israel','hamas']
    edges.extend([(gen[x[0]], gen[x[1]]) for x in combinations(words_i, 2)])  # basically there is an edge for each pair of top_words in a given article. For the example article above, we get edges between israel-palestine, israel-hamas, palestine-hamas.
unique_edges = list(set(edges))
unique_edge_frequency = {}
for e in unique_edges:
    unique_edge_frequency[e] = edges.count(e)
g = ig.Graph(vertex_attrs={"label": vertices}, edges=unique_edges, directed=False)
g.es['width'] = np.asarray([unique_edge_frequency[e] for e in unique_edge_frequency.keys()])*1.0/max(unique_edge_frequency.values())
And this is the line that raises the error:
dendrogram = g.community_fastgreedy(weights=g.es['width'])
What am I doing wrong?
Answer 0 (score: 3)
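Your graph contains multiple edges (i.e. more than one edge between the same pair of nodes). Fast-greedy community detection does not work on such graphs; you have to use g.simplify() to collapse the multiple edges into single edges.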
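One possible way duplicates can survive a set() call, offered here as an assumption about the question's data rather than something the traceback confirms: tuples are ordered, so (a, b) and (b, a) count as two different set elements even though they describe the same undirected edge. The sketch below uses made-up article word lists, and the names articles_top_words, naive_unique and edge_counts are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: the same two words appear in a different
# order in the two articles.
articles_top_words = [
    ["israel", "palestine", "hamas"],
    ["palestine", "israel"],
]

# Build one pair per co-occurrence, in the order the words are listed.
raw_pairs = []
for words_i in articles_top_words:
    raw_pairs.extend(combinations(words_i, 2))

# set() keeps ('israel', 'palestine') and ('palestine', 'israel') as
# two separate elements, so an undirected graph built from it still
# has two edges between those vertices.
naive_unique = set(raw_pairs)

# Sorting each pair first gives one canonical key per undirected edge,
# and Counter tallies how often each edge occurs.
edge_counts = Counter(tuple(sorted(pair)) for pair in raw_pairs)
```

Canonicalising the pairs like this is an alternative to simplify() if you want to count the co-occurrences yourself before building the graph.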
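It also looks like you are trying to set the "width" attribute of each edge based on how many edges there are between the same pair of vertices. Instead of building unique_edges and then unique_edge_frequency, you can simply do this: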
g = ig.Graph(edges, directed=False)
g.es["width"] = 1
g.simplify(combine_edges={ "width": "sum" })
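This first creates a graph with multiple edges, then assigns a width of 1 to each edge, and finally collapses the multiple edges into single edges while summing their widths.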