Question

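I am working with the DBLP dataset (metadata for more than 1.8 million publications, written by over 1 million authors, across thousands of journals and conference proceedings series), which contains the following columns: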

['id', 'title', 'authors', 'year', 'pub_venue', 'ref_id', 'ref_num', 'abstract']

I converted this data into a dictionary that maps each id to the list of ids from its ref_id column, and then built a graph from it in python igraph.

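from igraph import Graph

# Map each publication id to the list of ids it references.
b = {}
for row in res:
    key = row['id']
    if row['ref_id']:
        # Non-empty ref_id: split "id1;id2;..." into a list of ints.
        b[key] = [int(r) for r in row['ref_id'].strip().split(";")]
    else:
        b[key] = []  # None or empty string: no references

# One (source, target) pair per citation. Note that igraph interprets
# these integers as vertex indices.
graph = Graph(edges=[(v, e) for v in b for e in b[v]])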
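Since igraph interprets the integers in an edge list as vertex indices, sparse or large publication ids can make the graph allocate many more vertices than there are publications. The following is only a minimal sketch of remapping ids to contiguous indices before building the graph. The helper name `build_edge_list` and the sample rows are hypothetical, and it assumes each row behaves like a dict and that ids are integer-like.

```python
def build_edge_list(rows):
    """Turn rows with 'id' and 'ref_id' fields into (edges, index_of).

    edges uses contiguous vertex indices 0..n-1; index_of maps each
    original publication id to its assigned index.
    """
    index_of = {}

    def vertex(pub_id):
        # Assign the next free index the first time an id is seen.
        return index_of.setdefault(pub_id, len(index_of))

    edges = []
    for row in rows:
        source = vertex(int(row["id"]))
        ref_field = row.get("ref_id") or ""  # treat None like an empty string
        for ref in ref_field.split(";"):
            ref = ref.strip()
            if ref:  # ignore empty pieces such as a trailing ';'
                edges.append((source, vertex(int(ref))))
    return edges, index_of


# Hypothetical sample rows, not taken from DBLP.
sample_rows = [
    {"id": 1000001, "ref_id": "1000007;1000042"},
    {"id": 1000007, "ref_id": ""},
    {"id": 1000042, "ref_id": "1000007"},
]
edges, index_of = build_edge_list(sample_rows)
print(edges)
```

The resulting list could then be passed as `Graph(edges=edges)`, with `index_of` kept around to translate vertex indices back to publication ids. This only trims unused vertices; it does not by itself make a layout of millions of vertices fit in memory.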

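The graph contains millions of vertices. To visualize the data before running community detection algorithms, I use the plotting function available in igraph: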

from igraph import plot

layout = graph.layout("drl")
plot(graph, layout=layout)

However, this fails with a memory error on a system with 8 GB of RAM. Is there a better way to achieve the same thing in Python?

Plotting a large graph with millions of vertices in Python igraph
