Question

I need to find the degree of each protein in an input file like the one shown below:

A   B
a   b
c   d
a   c
c   b

I am using networkx to get the nodes. How do I use the input file to create edges between the nodes I have created?

Code:

import pandas as pd
df = pd.read_csv('protein.txt',sep='\t', index_col =0)
df = df.reset_index()
df.columns = ['a', 'b']

distinct = pd.concat([df['a'], df['b']]).unique()

import networkx as nx
G=nx.Graph()

nodes= []
for i in distinct:
    node=G.add_node(1)
    nodes.append(node)

Answer 1

From the networkx documentation: either call add_edge in a loop, or collect the edges first and then call add_edges_from:

>>> G = nx.Graph()   # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> e = (1,2)
>>> G.add_edge(1, 2)           # explicit two-node form
>>> G.add_edge(*e)             # single edge as tuple of two nodes
>>> G.add_edges_from( [(1,2)] ) # add edges from iterable container

G.degree() then gives you the degree of each node.
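As a minimal sketch of how this could apply to the sample data in the question, the snippet below collects the edges first and then passes them to add_edges_from. The inline `sample` string stands in for the contents of protein.txt and is an assumption for illustration; the names `rows`, `edges`, and `degrees` are likewise hypothetical.

```python
import networkx as nx

# Inline copy of the sample input from the question (assumed layout:
# one header row, then one whitespace-separated edge per line).
sample = """A   B
a   b
c   d
a   c
c   b
"""

# Split each non-empty line on whitespace and skip the header row.
rows = [line.split() for line in sample.splitlines() if line.strip()]
edges = [tuple(row) for row in rows[1:]]

G = nx.Graph()
# add_edges_from creates any missing nodes, so no separate add_node loop is needed.
G.add_edges_from(edges)

degrees = dict(G.degree())
print(degrees)  # {'a': 2, 'b': 2, 'c': 3, 'd': 1}
```

Because adding an edge also adds its endpoints, the node-creation loop from the question becomes unnecessary in this approach.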

Answer 2

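First, read_csv is being used incorrectly to read the input file. The columns are separated by spaces rather than tabs, so sep should be '\s+' instead of '\t'. In addition, the input file has no index column, so index_col should not be set to 0.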

Once the input file has been read into a DataFrame correctly, we can convert it into a networkx graph with the function from_pandas_edgelist.

import networkx as nx
import pandas as pd

# Raw string so that \s is not treated as an invalid escape sequence
df = pd.read_csv('protein.txt', sep=r'\s+')
g = nx.from_pandas_edgelist(df, 'A', 'B')
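To get from the graph to the per-protein degrees the question asks for, something like the following sketch may help. It reads the sample data from an in-memory io.StringIO buffer instead of protein.txt so that it runs on its own; that buffer and the name `degree_by_protein` are assumptions made for illustration.

```python
import io

import networkx as nx
import pandas as pd

# In-memory stand-in for protein.txt (assumed to be whitespace-separated
# with a header row "A B", as in the question).
data = io.StringIO("A   B\na   b\nc   d\na   c\nc   b\n")

df = pd.read_csv(data, sep=r"\s+")
g = nx.from_pandas_edgelist(df, "A", "B")

# g.degree() yields (node, degree) pairs, so dict() gives a plain lookup table.
degree_by_protein = dict(g.degree())

for protein, degree in sorted(degree_by_protein.items()):
    print(protein, degree)
```

For the sample input, protein c has the highest degree (3), since it appears in three of the four rows.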
