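I have been struggling with this for a while, and I know it is simple, but I have little experience with Python or NetworkX. My problem is straightforward: I am trying to plot a matrix from a large dataset (about 200 rows/columns) that looks like this. The first row and the first column contain the same labels.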
A,B,C,D,E,F,G,H,I,J,K
A,0,1,1,0,1,1,1,1,0,1,0
B,1,0,0,0,1,1,1,1,0,1,0
C,1,0,0,0,1,1,1,1,0,1,0
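It is just a matrix showing how people are connected, and all I want is to import this csv file and plot it in NetworkX with the corresponding labels.
I have this file (people.csv), and from the previous answers here it seems the best approach is to put the data into a numpy array.
This is where something seems to go wrong: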
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from numpy import genfromtxt

mydata = genfromtxt('mouse.csv', delimiter=',')
I get the following output:
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/npyio.py", line 1272, in genfromtxt
    fhd = iter(np.lib._datasource.open(fname, 'rbU'))
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/_datasource.py", line 145, in open
    return ds.open(path, mode)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/_datasource.py", line 472, in open
    found = self._findfile(path)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/_datasource.py", line 323, in _findfile
    if self.exists(name):
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/lib/_datasource.py", line 417, in exists
    from urllib2 import urlopen
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 94, in <module>
    import httplib
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 69, in <module>
    from array import array
  File "/Users/Plosslab/Documents/PythonStuff/array.py", line 4, in <module>
NameError: name 'array' is not defined
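Reading the traceback from the bottom up, the failing frame is in /Users/Plosslab/Documents/PythonStuff/array.py rather than in NumPy: the standard library's `from array import array` picked up a local file named array.py. That suggests a local file is shadowing the standard `array` module, in which case renaming that file (and removing any stale array.pyc) would be the usual remedy. Below is a minimal sketch for checking where a module name resolves. It is written for Python 3, and the helper names `module_origin` and `is_inside` are made up for illustration.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file a top-level module name resolves to, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_inside(path, directory):
    """Return True if path is an absolute file path located under directory."""
    if path is None or not os.path.isabs(path):
        return False  # e.g. 'built-in' or 'frozen' modules have no file path
    path = os.path.realpath(path)
    directory = os.path.realpath(directory)
    try:
        return os.path.commonpath([path, directory]) == directory
    except ValueError:  # paths on different drives (Windows)
        return False


origin = module_origin("array")
print("array resolves to:", origin)
if is_inside(origin, os.getcwd()):
    print("A local file is shadowing the standard 'array' module.")
```

Run it from the directory that holds the script; if the reported origin is inside that directory, the local file is the one being imported.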
Answer 0 (score: 18)
I made a small csv called mycsv.csv with the following contents:
,a,b,c,d
a,0,1,0,1
b,1,0,1,0
c,0,1,0,1
d,1,0,1,0
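You don't have a ',' as the first character of the first row, but a space instead, so let me know if this is a mistake on my part. The general idea is the same either way. Read in the csv like this: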
from numpy import genfromtxt
import numpy as np
mydata = genfromtxt('mycsv.csv', delimiter=',')
print(mydata)
print(type(mydata))
This prints:
[[ nan nan nan nan nan]
[ nan 0. 1. 0. 1.]
[ nan 1. 0. 1. 0.]
[ nan 0. 1. 0. 1.]
[ nan 1. 0. 1. 0.]]
<type 'numpy.ndarray'>
Now that the csv has been read in as a numpy array, we only need to extract the adjacency matrix:
adjacency = mydata[1:,1:]
print(adjacency)
This prints:
[[ 0. 1. 0. 1.]
[ 1. 0. 1. 0.]
[ 0. 1. 0. 1.]
[ 1. 0. 1. 0.]]
If my small example isn't exactly like your file, you can slice your numpy array as you see fit.
To plot the graph, you need to import matplotlib and networkx:
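Before building a graph it can help to confirm that the slice really removed the label cells. Here is a minimal sketch, assuming the same layout as mycsv.csv; the data is kept in memory so the example is self-contained, and `check_adjacency` is a hypothetical helper, not part of numpy or networkx.

```python
import io

import numpy as np

# Same layout as mycsv.csv: an empty first header cell, then the labels.
csv_text = """,a,b,c,d
a,0,1,0,1
b,1,0,1,0
c,0,1,0,1
d,1,0,1,0
"""

mydata = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
adjacency = mydata[1:, 1:]  # drop the label row and the label column


def check_adjacency(matrix):
    """Raise ValueError if matrix cannot serve as an undirected adjacency matrix."""
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        raise ValueError("not square: %r" % (matrix.shape,))
    if np.isnan(matrix).any():
        raise ValueError("contains NaN: label cells were probably not sliced off")
    if not np.array_equal(matrix, matrix.T):
        raise ValueError("not symmetric: an undirected graph would merge A->B and B->A")


check_adjacency(adjacency)
print(adjacency.shape)
```

The symmetry check only matters for an undirected `nx.Graph`; for a directed graph it can be dropped.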
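import matplotlib.pyplot as plt
import networkx as nx

def show_graph_with_labels(adjacency_matrix, mylabels):
    # Each (row, col) position holding a 1 is an edge between nodes row and col.
    rows, cols = np.where(adjacency_matrix == 1)
    edges = zip(rows.tolist(), cols.tolist())
    gr = nx.Graph()
    gr.add_edges_from(edges)
    # mylabels maps node index -> label text.
    nx.draw(gr, node_size=500, labels=mylabels, with_labels=True)
    plt.show()

# get_labels and make_label_dict are not defined in this answer; they must exist before this call.
show_graph_with_labels(adjacency, make_label_dict(get_labels('mycsv.csv')))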
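The call above relies on `get_labels` and `make_label_dict`, which this answer does not define. One possible sketch of the two helpers follows, assuming the header row starts with an empty cell as in mycsv.csv. These bodies are a guess at what the helpers do, not the original author's code.

```python
import csv


def get_labels(csv_path):
    """Return the column labels from the header row of the csv file."""
    with open(csv_path, newline='') as handle:
        header = next(csv.reader(handle))
    # Assumption: the first header cell is empty (as in mycsv.csv), so skip it.
    return [cell.strip() for cell in header[1:]]


def make_label_dict(labels):
    """Map node index -> label, in the row/column order of the matrix."""
    return {index: label for index, label in enumerate(labels)}
```

If the header has no leading empty cell, as in the file shown in the question, `header[1:]` would drop the first real label and should be replaced by `header`.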
Here is a short tutorial on graphs with Python.
Answer 1 (score: 12)
This can be done easily using pandas and networkx.
For example, I created a small csv file called test.csv as
A,B,C,D,E,F,G,H,I,J,K
A,0,1,1,0,1,1,1,1,0,1,0
B,1,0,0,0,1,1,1,1,0,1,0
C,1,0,0,0,1,1,1,1,0,1,0
D,0,0,0,0,1,0,1,1,0,1,0
E,1,0,0,0,1,1,1,1,0,1,0
F,0,0,1,0,1,0,0,0,0,1,0
G,1,0,0,0,0,0,0,1,0,0,0
H,1,0,0,0,1,1,1,0,0,1,0
I,0,0,0,1,0,0,0,0,0,0,0
J,1,0,0,0,1,1,1,1,0,1,0
K,1,0,0,0,1,0,1,0,0,1,0
You can read this csv file and create the graph as follows
import pandas as pd
import networkx as nx
input_data = pd.read_csv('test.csv', index_col=0)
G = nx.DiGraph(input_data.values)
To plot this graph, use
nx.draw(G)
You will get a plot similar to this.
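Building the graph from `input_data.values` gives integer nodes 0..n-1, so the row and column names are not carried into the graph. A minimal sketch that keeps them uses `nx.from_pandas_adjacency`, which needs a reasonably recent NetworkX. The in-memory csv below is an assumption made for self-containment; it has a leading comma in the header so that the index and the columns carry the same labels. `draw_labeled` is a hypothetical helper name.

```python
import io

import networkx as nx
import pandas as pd

# Small stand-in for test.csv with an explicit empty first header cell.
csv_text = """,a,b,c,d
a,0,1,0,1
b,1,0,1,0
c,0,1,0,1
d,1,0,1,0
"""

input_data = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Row/column names become the node names; every nonzero cell becomes an edge.
G = nx.from_pandas_adjacency(input_data, create_using=nx.DiGraph)
print(sorted(G.nodes()))


def draw_labeled(graph):
    """Draw the graph with its node names as labels."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing
    nx.draw(graph, with_labels=True, node_size=500)
    plt.show()
```

Calling `draw_labeled(G)` then shows the names on the nodes. `from_pandas_adjacency` raises an error when the index and column labels differ, which doubles as a check that the file was parsed as intended.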
Answer 2 (score: 0)
This is the same as Scott's excellent answer, but it correctly handles nodes that have no edges.
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np

def show_graph_with_labels(adjacency_matrix, mylabels):
    rows, cols = np.where(adjacency_matrix == 1)
    edges = zip(rows.tolist(), cols.tolist())
    gr = nx.Graph()
    # Add every node up front so that nodes without edges still appear.
    all_rows = range(0, adjacency_matrix.shape[0])
    for n in all_rows:
        gr.add_node(n)
    gr.add_edges_from(edges)
    nx.draw(gr, node_size=900, labels=mylabels, with_labels=True)
    plt.show()
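To see why the extra loop matters, here is a small sketch with a made-up 3x3 matrix whose last node has no edges. It compares a graph built from the edge list alone with one built by `nx.from_numpy_array`, which creates one node per row. This is an alternative to the loop above, not what the answer itself uses.

```python
import networkx as nx
import numpy as np

# Hypothetical example: node 2 ('c') is connected to nothing.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 0],
                      [0, 0, 0]])
labels = {0: 'a', 1: 'b', 2: 'c'}

# Building from edges only: isolated nodes never enter the graph.
rows, cols = np.where(adjacency == 1)
edges_only = nx.Graph()
edges_only.add_edges_from(zip(rows.tolist(), cols.tolist()))

# Building from the whole matrix: one node per row, then rename by label.
full = nx.relabel_nodes(nx.from_numpy_array(adjacency), labels)

print(edges_only.number_of_nodes(), full.number_of_nodes())
print(list(nx.isolates(full)))
```

A label dictionary that names a node missing from the graph can make drawing fail in some NetworkX versions, which is why keeping isolated nodes in the graph matters here.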