从CSV文件中的Adjacency Matrix绘制NetworkX图

时间:2015-04-11 00:24:39

标签: python csv numpy networkx

我现在一直在与这个问题作斗争,我知道这很简单 - 但我对Python或NetworkX没什么经验。我的问题非常简单,我试图绘制一个大型数据集(大约200行/列)的矩阵,看起来像这样。第一行和第一列是相同的。

  A,B,C,D,E,F,G,H,I,J,K
A,0,1,1,0,1,1,1,1,0,1,0
B,1,0,0,0,1,1,1,1,0,1,0
C,1,0,0,0,1,1,1,1,0,1,0

它只是一个显示人们如何连接的矩阵, 我想要的只是导入和绘制这个csv文件,并在NetworkX中显示相应的标签。

我有这个文件(people.cs v),并查看以前的答案here,看来最好的方法是将数据放在一个numpy的数组中。

这似乎有问题:

import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from numpy import genfromtxt
import numpy as np

mydata = genfromtxt('mouse.csv', delimiter=',')

我得到以下输出:

File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/npyio.py", line 1272, in genfromtxt
  fhd = iter(np.lib._datasource.open(fname, 'rbU'))
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/_datasource.py", line 145, in open
  return ds.open(path, mode)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/_datasource.py", line 472, in open
  found = self._findfile(path)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/_datasource.py", line 323, in _findfile
  if self.exists(name):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/_datasource.py", line 417, in exists
  from urllib2 import urlopen
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 94, in <module>
  import httplib
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 69, in <module>
  from array import array
      File "/Users/Plosslab/Documents/PythonStuff/array.py", line 4, in <module>
      NameError: name 'array' is not defined

3 个答案:

答案 0 :(得分:18)

我创建了一个名为mycsv.csv的小型csv,其中包含以下内容:

,a,b,c,d
a,0,1,0,1
b,1,0,1,0
c,0,1,0,1
d,1,0,1,0

你没有&#39;&#39;作为第一行的第一个字符,但你有一个空格,所以如果这是我的错误让我知道。一般的想法是一样的。请阅读csv:

from numpy import genfromtxt
import numpy as np
mydata = genfromtxt('mycsv.csv', delimiter=',')
print(mydata)
print(type(mydata))

打印:

[[ nan  nan  nan  nan  nan]
 [ nan   0.   1.   0.   1.]
 [ nan   1.   0.   1.   0.]
 [ nan   0.   1.   0.   1.]
 [ nan   1.   0.   1.   0.]]
<type 'numpy.ndarray'>

现在我们将csv作为一个numpy数组读入,我们只需要提取邻接矩阵:

adjacency = mydata[1:,1:]
print(adjacency)

打印:

[[ 0.  1.  0.  1.]
 [ 1.  0.  1.  0.]
 [ 0.  1.  0.  1.]
 [ 1.  0.  1.  0.]]

如果我的小例子与你的完全一样,你可以根据需要切割你的numpy数组。

要绘制图形,您需要导入matplotlib和networkx:

import matplotlib.pyplot as plt
import networkx as nx

def show_graph_with_labels(adjacency_matrix, mylabels):
    rows, cols = np.where(adjacency_matrix == 1)
    edges = zip(rows.tolist(), cols.tolist())
    gr = nx.Graph()
    gr.add_edges_from(edges)
    nx.draw(gr, node_size=500, labels=mylabels, with_labels=True)
    plt.show()

show_graph_with_labels(adjacency, make_label_dict(get_labels('mycsv.csv')))

这是使用python的图表上的简短tutorial

graph from csv

答案 1 :(得分:12)

使用pandasnetworkx可以轻松完成此操作。

例如,我创建了一个名为csv的小test.csv文件作为

A,B,C,D,E,F,G,H,I,J,K
A,0,1,1,0,1,1,1,1,0,1,0
B,1,0,0,0,1,1,1,1,0,1,0
C,1,0,0,0,1,1,1,1,0,1,0
D,0,0,0,0,1,0,1,1,0,1,0
E,1,0,0,0,1,1,1,1,0,1,0
F,0,0,1,0,1,0,0,0,0,1,0
G,1,0,0,0,0,0,0,1,0,0,0
H,1,0,0,0,1,1,1,0,0,1,0
I,0,0,0,1,0,0,0,0,0,0,0
J,1,0,0,0,1,1,1,1,0,1,0
K,1,0,0,0,1,0,1,0,0,1,0

您可以阅读此csv文件并按如下所示创建图表

import pandas as pd
import networkx as nx
input_data = pd.read_csv('test.csv', index_col=0)
G = nx.DiGraph(input_data.values)

要绘制此图表,请使用

nx.draw(G)

你会得到类似于此的情节。

Output of <code>nx.draw(G)</code>

答案 2 :(得分:0)

这与Scott's excellent answer相同,但是可以正确处理没有边缘的节点。

import matplotlib.pyplot as plt
import networkx as nx

def show_graph_with_labels(adjacency_matrix, mylabels):
    rows, cols = np.where(adjacency_matrix == 1)
    edges = zip(rows.tolist(), cols.tolist())
    gr = nx.Graph()
    all_rows = range(0, adjacency_matrix.shape[0])
    for n in all_rows:
        gr.add_node(n)
    gr.add_edges_from(edges)
    nx.draw(gr, node_size=900, labels=mylabels, with_labels=True)
    plt.show()