Question

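I am working on a machine learning project on link prediction, but I am stuck on reading the data with networkx:

The training data I want to read is stored in a file "train.txt" with the following structure: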

1 2
2 3
4 3 5 1

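Each line represents a node followed by its neighbours, e.g. line 3: node 4 is connected to nodes 3, 5 and 1.

The code I use to read the network data is: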

G = nx.read_edgelist('train.txt',delimiter = "\t",create_using = nx.DiGraph(),nodetype = int)

But this code raises a TypeError because it fails to convert the edge data:

TypeError: Failed to convert edge data (['3105725', '2828522', '4394015', '2367409', '2397416', ..., '759864']) to dictionary.

Answer 1

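Welcome!

Your comment is correct: this is not an edge list in the classical sense. An edge list for networkx has exactly one edge per line, that is, a source node and a target node, optionally followed by edge data.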
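To make the difference concrete, here is a minimal sketch using nx.parse_edgelist, the in-memory counterpart of nx.read_edgelist. The sample lines are made up for illustration (the first list is the question's data rewritten as one edge per line); nothing here comes from the real train.txt.

```python
import networkx as nx

# A proper edge list: exactly one "source target" pair per line.
edge_lines = ["1 2", "2 3", "4 3", "4 5", "4 1"]
G = nx.parse_edgelist(edge_lines, create_using=nx.DiGraph(), nodetype=int)
print(sorted(G.edges()))

# The question's format: a node followed by all of its neighbours.
# Everything after the first two entries is treated as edge data,
# which cannot be converted to a dictionary.
adj_lines = ["1 2", "2 3", "4 3 5 1"]
try:
    nx.parse_edgelist(adj_lines, create_using=nx.DiGraph(), nodetype=int)
    failed = False
except TypeError as err:
    failed = True
    print(err)
```

The second call should fail with the same kind of TypeError as in the question, because the extra neighbours on the last line are read as edge data.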

Here is one way to solve the problem: read the file line by line and add each edge to the graph as you go.

import networkx as nx

D = nx.DiGraph()
with open('train.txt', 'r') as f:
    for line in f:
        # Split on any whitespace (tabs or spaces); this also drops the trailing newline.
        # The first entry is the node, the others are its neighbours.
        nodes = line.split()
        if not nodes:  # skip blank lines
            continue
        focal_node = nodes[0]  # pick your node (IDs stay strings; wrap in int() if you need integers)
        # A node without neighbours should still be added to the network.
        D.add_node(focal_node)
        for friend in nodes[1:]:  # loop over the neighbours
            D.add_edge(focal_node, friend)  # add each edge to the graph

nx.draw_networkx(D)  # for fun, draw your network (requires matplotlib)

Answer 2

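nx.read_edgelist expects one edge per line: a source and a target, optionally followed by arbitrary edge data. It is therefore not what you should use in this case.
networkx provides a way to read an adjacency list from a file: nx.read_adjlist.
Consider the file graph_adjlist.txt.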

1   2   3   4
2   5
3   5
4   5

A graph can be created from this adjacency list as follows.

import networkx as nx

G = nx.read_adjlist('graph_adjlist.txt', create_using = nx.DiGraph(), nodetype = int)

print(G.nodes(),G.edges())
# [1, 2, 3, 4, 5] [(1, 2), (1, 3), (1, 4), (2, 5), (3, 5), (4, 5)]
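The same approach should carry over to the format in the question. As a small sketch, nx.parse_adjlist (the in-memory counterpart of nx.read_adjlist) is applied here to the three sample lines from the question rather than to the real train.txt:

```python
import networkx as nx

# Lines shaped like the question's train.txt: a node followed by its neighbours.
lines = ["1 2", "2 3", "4 3 5 1"]

G = nx.parse_adjlist(lines, create_using=nx.DiGraph(), nodetype=int)
print(sorted(G.edges()))
# [(1, 2), (2, 3), (4, 1), (4, 3), (4, 5)]
```

By default both functions split on any whitespace. If the file is strictly tab-separated and you want to enforce that, nx.read_adjlist also accepts delimiter="\t".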

Networkx: reading an adjacency list from a file
