Question

I have a dataset with a header row and data rows, in the following format:

login,project_name,ccount

kts,blocklist-ipsets,2192232
Ken,heartbeat,1477010
Code,feature-tests,584334
dat,mirage-ci-logs,584046
vip,committer,533839
RQHVMCAU,mirror,505976
ANHTMOTA,d43-en,378663
mikins,openapsdev,348992
direwolf-github,my-app,340046

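where login, project_name and ccount are the headers.
How do I read this dataset into a networkx graph and view the graph? The ccount value should be the weight of the edge. I am new to Python and networkx.
Here is what I have tried so far: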

import networkx as nx
import csv
def _get_graph_file():
   G = nx.DiGraph()
   git = csv.reader('file.csv')
   G.add_weighted_edges_from(git_df)
   return G
print(_get_graph_file())

Answer 1

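You need to pass a file object to the csv reader, not the file name. The call that adds the weighted edges is also wrong: it is given git_df, a name that is never defined, when the reader is actually called git.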
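To see why the file name does not work, here is a small sketch comparing the two calls. It uses io.StringIO as a stand-in for open('file.csv'), and the sample text is only the first rows from the question; both are assumptions made so the snippet runs without a file on disk.

```python
import csv
import io

# csv.reader iterates over whatever it is given. A plain string is iterated
# character by character, so the file name itself is parsed as the "lines".
rows_from_name = list(csv.reader('file.csv'))

# A file-like object is iterated line by line, which is what we want.
# io.StringIO stands in for open('file.csv') here (illustrative sample data).
sample = io.StringIO(
    "login,project_name,ccount\n"
    "\n"
    "kts,blocklist-ipsets,2192232\n"
)
rows_from_file = list(csv.reader(sample))

print(rows_from_name)   # one single-character row per letter of the name
print(rows_from_file)   # header row, an empty row, then the data row
```

Note that the blank line comes back as an empty list and that ccount is still a string, so both need handling before the rows can be used as weighted edges.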

Here is the code with the correct syntax and format:

import networkx as nx
import csv

def _get_graph_file():
   G = nx.DiGraph()

   # Open the csv file (newline='' is what the csv module recommends)
   with open('file.csv', newline='') as file_obj:

      # Pass the file object to csv reader
      git = csv.reader(file_obj, delimiter=',')

      # Ignore the headers
      headers = next(git)

      # git is the variable to be passed, not git_df.
      # Skip empty rows (such as the line between the headers and the data)
      # and convert ccount to int so that the weight is numeric.
      G.add_weighted_edges_from(
         (row[0], row[1], int(row[2])) for row in git if row)

   return G

my_graph = _get_graph_file()

# To get the list of nodes
print(my_graph.nodes())

# To get the list of edges
print(my_graph.edges())

# To get the weight of the edge between two nodes
print(my_graph['Ken']['heartbeat']['weight'])
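Since the question also asks how to view the weights, here is a minimal sketch of listing every edge together with its weight. The graph is built from three rows of the question's data rather than from file.csv, purely so the example is self-contained.

```python
import networkx as nx

G = nx.DiGraph()
# Three (login, project_name, ccount) rows taken from the question's sample
G.add_weighted_edges_from([
    ('kts', 'blocklist-ipsets', 2192232),
    ('Ken', 'heartbeat', 1477010),
    ('Code', 'feature-tests', 584334),
])

# edges(data='weight') yields (source, target, weight) triples
for login, project, ccount in G.edges(data='weight'):
    print(login, project, ccount)

# The same triples can be used to find the edge with the largest weight
heaviest = max(G.edges(data='weight'), key=lambda edge: edge[2])
print(heaviest)
```

Because the graph is a DiGraph, each edge points from the login to the project, so G['heartbeat']['Ken'] would raise a KeyError.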

Answer 2

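Pandas works at least as well as the csv module here, especially since you probably want the ccount column converted to numbers. It also has built-in support for skipping that blank line at the start.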

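import networkx as nx
import pandas as pd

def _get_graph_file():
    G = nx.DiGraph()
    # skiprows=[1] drops the blank line that follows the header row
    git = pd.read_csv('file.csv', skiprows=[1])
    # Each row of git.values is (login, project_name, ccount)
    G.add_weighted_edges_from(git.values)
    return G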

G = _get_graph_file()
print(G['kts']['blocklist-ipsets'])

Output:

{'weight': 2192232}
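As a possible alternative, networkx can build the graph straight from the DataFrame with from_pandas_edgelist. The sketch below is not part of the original answer. It assumes the column is renamed to weight so that the edge attribute uses the same key as above, and it reads from an in-memory sample (two rows from the question) instead of file.csv.

```python
import io

import networkx as nx
import pandas as pd

# Illustrative stand-in for file.csv, including the blank line after the header
sample = io.StringIO(
    "login,project_name,ccount\n"
    "\n"
    "kts,blocklist-ipsets,2192232\n"
    "Ken,heartbeat,1477010\n"
)

# read_csv skips blank lines by default (skip_blank_lines=True)
df = pd.read_csv(sample)

# Rename ccount so the edge attribute is stored under the key 'weight'
df = df.rename(columns={'ccount': 'weight'})

G = nx.from_pandas_edgelist(
    df,
    source='login',
    target='project_name',
    edge_attr='weight',
    create_using=nx.DiGraph(),
)

print(G['kts']['blocklist-ipsets'])
```

Without the rename, passing edge_attr='ccount' would store the value under the key ccount, and algorithms that look for weight by default would not see it.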

Unable to load a CSV file into networkx
