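I am writing a graph object out to an XML representation. My code works overall, but it is too slow on my large graphs. I am trying to parallelize it, but I am not getting the SubElements back from the pool. I am sure I am missing something obvious, but I am new to Python.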
import networkx as nx
import lxml.etree as et
from multiprocessing import Pool
G = nx.petersen_graph()
# For any graph, make a node subelement with the id being the node label
def getNodeAttributes(index):
    # list(...) makes positional indexing work regardless of networkx version
    et.SubElement(nodes, "node", attrib={'id': str(list(G.nodes())[index])})
# Do it with one monolithic process
network = et.Element("network", attrib={"name": "Petersen Graph"})
nodes = et.SubElement(network, "nodes")
for i in range(len(G)):
    getNodeAttributes(i)
et.dump(network)
<network name="Petersen Graph">
  <nodes>
    <node id="0"/>
    <node id="1"/>
    <node id="2"/>
    <node id="3"/>
    <node id="4"/>
    <node id="5"/>
    <node id="6"/>
    <node id="7"/>
    <node id="8"/>
    <node id="9"/>
  </nodes>
</network>
# Do it again, but with pool.map in parallel
network = et.Element("network", attrib={"name": "Petersen Graph"})
nodes = et.SubElement(network, "nodes")
pool = Pool(4)
pool.map(getNodeAttributes, range(len(G)))
pool.close()
pool.join()
et.dump(network)
<network name="Petersen Graph">
  <nodes/>
</network>
Answer 0 (score: 1)
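Use a queue (multiprocessing.Queue) to collect the results from the worker processes. See the answers to this question: Sharing a result queue among several processes.
That said, I am not sure how much it will help in your case, because the XML file has to be read and parsed sequentially, and the element tree will be very large. But give it a try...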
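The reason the pooled version prints an empty <nodes/> is that each worker runs in its own process with its own copy of nodes, so the SubElements created there never reach the parent. A rough sketch of one way around this: have the workers return plain, picklable data and build the elements in the parent. Note the assumptions: it relies on pool.map handing back each worker's return value (in input order) instead of an explicit multiprocessing.Queue, it uses the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra packages, range(10) stands in for the Petersen graph's node labels, and node_attributes and build_network are names made up for illustration.

```python
import xml.etree.ElementTree as ET
from multiprocessing import Pool


def node_attributes(label):
    """Runs in a worker: return picklable data instead of touching the tree."""
    return {"id": str(label)}


def build_network(name, attribs):
    """Runs in the parent: turn the collected attribute dicts into elements."""
    network = ET.Element("network", attrib={"name": name})
    nodes = ET.SubElement(network, "nodes")
    for attrib in attribs:
        ET.SubElement(nodes, "node", attrib=attrib)
    return network


if __name__ == "__main__":
    labels = list(range(10))  # assumption: stand-in for the graph's node labels
    with Pool(4) as pool:
        # pool.map returns the workers' results in the same order as the inputs
        attribs = pool.map(node_attributes, labels)
    network = build_network("Petersen Graph", attribs)
    print(ET.tostring(network, encoding="unicode"))
```

Whether this is actually faster depends on how expensive the per-node work is: for something as cheap as str(label), the cost of pickling and shipping results between processes will likely outweigh any gain, and the tree itself is still assembled sequentially in the parent.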