I have a dataset with a header row and data rows, in this format:
login,project_name,ccount
kts,blocklist-ipsets,2192232
Ken,heartbeat,1477010
Code,feature-tests,584334
dat,mirage-ci-logs,584046
vip,committer,533839
RQHVMCAU,mirror,505976
ANHTMOTA,d43-en,378663
mikins,openapsdev,348992
direwolf-github,my-app,340046
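Here login, project_name, and ccount are the headers.
How can I read this dataset into a networkx graph so that the ccount value is the edge weight? I am new to Python and networkx.
This is what I have tried so far: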
import networkx as nx
import csv

def _get_graph_file():
    G = nx.DiGraph()
    git = csv.reader('file.csv')
    G.add_weighted_edges_from(git_df)
    return G

print(_get_graph_file())
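Answer 0 (score: 0)
You need to pass an open file object to csv.reader, not the filename string. The call that adds the weighted edges is also wrong because it uses the wrong variable name: the reader is stored in git, while git_df is never defined.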
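To see why the filename string fails, here is a small sketch. csv.reader iterates over whatever it is given, so a plain string is consumed one character at a time. The in-memory sample built with io.StringIO is an assumption for illustration and stands in for open('file.csv').

```python
import csv
import io

# Passing a string: csv.reader iterates over its characters,
# so every character becomes its own one-field "row".
rows_from_string = list(csv.reader('file.csv'))

# Passing a file-like object: csv.reader iterates over its lines.
# io.StringIO stands in for open('file.csv') here (illustrative sample).
sample = io.StringIO("login,project_name,ccount\nkts,blocklist-ipsets,2192232\n")
rows_from_file = list(csv.reader(sample))

print(rows_from_string[:4])
print(rows_from_file)
```

No error is raised in the first case, which is why the mistake is easy to miss: the reader simply returns rows that have nothing to do with the file contents.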
Here is the code with the correct syntax and formatting:
import csv
import networkx as nx

def _get_graph_file():
    G = nx.DiGraph()
    # Open the CSV file and pass the file object (not the filename) to the csv reader
    with open('file.csv', newline='') as file_obj:
        git = csv.reader(file_obj, delimiter=',')
        # Skip the header row
        headers = next(git)
        # Skip blank rows, such as the line between the header and the data,
        # and convert ccount to int so that the weights are numeric
        edges = [(row[0], row[1], int(row[2])) for row in git if row]
        # Pass the rows read through git, not the undefined git_df
        G.add_weighted_edges_from(edges)
    return G

my_graph = _get_graph_file()
# List the nodes
print(my_graph.nodes())
# List the edges
print(my_graph.edges())
# Get the weight of the edge between two nodes
print(my_graph['Ken']['heartbeat']['weight'])
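One thing to keep in mind with the csv module is that every field arrives as a string, so ccount has to be converted explicitly if the weight should be a number. The following is a minimal sketch of that idea using csv.DictReader, which reads the header row itself and lets each column be addressed by name. The helper name build_graph and the in-memory SAMPLE are assumptions for illustration, standing in for file.csv.

```python
import csv
import io

import networkx as nx

# Illustrative sample standing in for file.csv (first rows of the question's data).
SAMPLE = """login,project_name,ccount
kts,blocklist-ipsets,2192232
Ken,heartbeat,1477010
"""

def build_graph(file_obj):
    """Hypothetical helper: build a DiGraph from a CSV file-like object
    with login, project_name and ccount columns."""
    G = nx.DiGraph()
    # DictReader consumes the header row and keys each row by column name.
    for row in csv.DictReader(file_obj):
        # csv yields strings, so convert ccount to int before storing it.
        G.add_edge(row['login'], row['project_name'], weight=int(row['ccount']))
    return G

G = build_graph(io.StringIO(SAMPLE))

# Print each edge together with its weight attribute.
for u, v, w in G.edges(data='weight'):
    print(u, v, w)
```

With a real file, the same helper could be called as build_graph(open('file.csv', newline='')), ideally inside a with block so the file is closed afterwards.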
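Answer 1 (score: 0)
pandas does this job at least as well as the csv module, especially since you probably want the ccount column converted to a number. It also has built-in support for skipping the blank line near the start of the file.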
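import networkx as nx
import pandas as pd

def _get_graph_file():
    G = nx.DiGraph()
    # skiprows=[1] skips the second line of the file (the blank line after the header)
    git = pd.read_csv('file.csv', skiprows=[1])
    # Each row of git.values is (login, project_name, ccount)
    G.add_weighted_edges_from(git.values)
    return G

G = _get_graph_file()
print(G['kts']['blocklist-ipsets'])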
Output:
{'weight': 2192232}
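As an alternative sketch, not part of the answer above, networkx also provides from_pandas_edgelist, which builds the graph directly from named DataFrame columns. The in-memory SAMPLE is an assumption standing in for file.csv, and renaming ccount to weight is a choice made here so that the edge attribute carries the conventional name.

```python
import io

import networkx as nx
import pandas as pd

# Illustrative sample standing in for file.csv (first rows of the question's data).
SAMPLE = """login,project_name,ccount
kts,blocklist-ipsets,2192232
Ken,heartbeat,1477010
"""

# read_csv parses ccount as an integer column automatically.
df = pd.read_csv(io.StringIO(SAMPLE))

# Rename the column so the edge attribute is called 'weight'.
df = df.rename(columns={'ccount': 'weight'})

G = nx.from_pandas_edgelist(
    df,
    source='login',          # column holding the edge's start node
    target='project_name',   # column holding the edge's end node
    edge_attr='weight',      # column copied onto each edge as an attribute
    create_using=nx.DiGraph(),
)

print(G['kts']['blocklist-ipsets'])
```

Naming the columns explicitly avoids relying on their order in the file, which add_weighted_edges_from(git.values) depends on.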