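This question arose from an attempt to answer the question "Graph isomorphism for jar files": how can a jar file be represented as a graph using Python?
Problem: given a jar file, read the files it contains and represent the contents as (a) a data structure and (b) a graph, both suitable for further study and manipulation, for example evaluating isomorphism with another jar file. In the graph, the directory tree should form the root and branch nodes, with the files as leaf nodes.
To standardize answers, I used the verletphysics.jar file downloaded from this OpenProcessing sketch.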
Answer 0 (score: 8)
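Since jar files are essentially zip archives, the zipfile module from the Python standard library can read the contents, which are then used to prepare both a textual and a graphical representation of how the jar's entries relate to one another.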
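As a minimal sketch of that reading step, the snippet below lists the entries of an archive with `ZipFile.namelist()`. To stay runnable without a real jar it first builds a tiny stand-in archive in memory; that archive, its two entry names (borrowed from the listing), and the helper name `list_entries` are illustrative assumptions, not part of the original answer.

```python
import io
import zipfile


def list_entries(jar_file):
    """Return the entry names stored in a jar (zip) archive."""
    with zipfile.ZipFile(jar_file, "r") as jar:
        return jar.namelist()


# Build a small stand-in archive in memory so the sketch runs without a real jar.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    jar.writestr("toxi/physics/VerletParticle.class", b"")

buffer.seek(0)
entries = list_entries(buffer)
for name in entries:
    print(name)
```

Passing a path such as 'verletphysics.jar' instead of the in-memory buffer should work the same way, since `zipfile.ZipFile` accepts either a file name or a file-like object.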
For the verletphysics.jar file mentioned in the question, the code at the end of this answer produces this listing of its contents:
META-INF/
META-INF/MANIFEST.MF
toxi/
toxi/physics/
toxi/physics/behaviors/
toxi/physics/constraints/
toxi/physics2d/
toxi/physics2d/behaviors/
toxi/physics2d/constraints/
toxi/physics/ParticlePath.class
toxi/physics/ParticleString.class
toxi/physics/PullBackString.class
toxi/physics/VerletConstrainedSpring.class
toxi/physics/VerletMinDistanceSpring.class
toxi/physics/VerletParticle.class
toxi/physics/VerletPhysics.class
toxi/physics/VerletSpring.class
toxi/physics/behaviors/AttractionBehavior.class
toxi/physics/behaviors/ConstantForceBehavior.class
toxi/physics/behaviors/GravityBehavior.class
toxi/physics/behaviors/ParticleBehavior.class
toxi/physics/constraints/AxisConstraint.class
toxi/physics/constraints/BoxConstraint.class
toxi/physics/constraints/CylinderConstraint.class
toxi/physics/constraints/MaxConstraint.class
toxi/physics/constraints/MinConstraint.class
toxi/physics/constraints/ParticleConstraint.class
toxi/physics/constraints/PlaneConstraint.class
toxi/physics/constraints/SoftBoxConstraint.class
toxi/physics/constraints/SphereConstraint.class
toxi/physics2d/ParticlePath2D.class
toxi/physics2d/ParticleString2D.class
toxi/physics2d/PullBackString2D.class
toxi/physics2d/VerletConstrainedSpring2D.class
toxi/physics2d/VerletMinDistanceSpring2D.class
toxi/physics2d/VerletParticle2D.class
toxi/physics2d/VerletPhysics2D.class
toxi/physics2d/VerletSpring2D.class
toxi/physics2d/behaviors/AttractionBehavior.class
toxi/physics2d/behaviors/ConstantForceBehavior.class
toxi/physics2d/behaviors/GravityBehavior.class
toxi/physics2d/behaviors/ParticleBehavior2D.class
toxi/physics2d/constraints/AngularConstraint.class
toxi/physics2d/constraints/AxisConstraint.class
toxi/physics2d/constraints/CircularConstraint.class
toxi/physics2d/constraints/MaxConstraint.class
toxi/physics2d/constraints/MinConstraint.class
toxi/physics2d/constraints/ParticleConstraint2D.class
toxi/physics2d/constraints/RectConstraint.class
verletphysics.mf
Each component name in the path names above is extracted by the code and given a unique ID, as follows:
Index File
0 behaviors
1 BoxConstraint.class
2 MaxConstraint.class
3 VerletParticle.class
4 ParticleConstraint2D.class
5 ConstantForceBehavior.class
6 META-INF
7 VerletMinDistanceSpring2D.class
8 AxisConstraint.class
9 AttractionBehavior.class
10 physics2d
11 VerletPhysics.class
12 PullBackString.class
13 VerletSpring.class
14 VerletConstrainedSpring.class
15 ParticleString2D.class
16 verletphysics.mf
17 ParticleBehavior2D.class
18 ParticleString.class
19 RectConstraint.class
20 CylinderConstraint.class
21 toxi
22 VerletMinDistanceSpring.class
23 VerletSpring2D.class
24 VerletParticle2D.class
25 ParticlePath2D.class
26 CircularConstraint.class
27 ParticlePath.class
28 MinConstraint.class
29 MANIFEST.MF
30 ParticleConstraint.class
31 GravityBehavior.class
32 VerletPhysics2D.class
33 SoftBoxConstraint.class
34 ParticleBehavior.class
35 VerletConstrainedSpring2D.class
36 PlaneConstraint.class
37 PullBackString2D.class
38 SphereConstraint.class
39 physics
40 AngularConstraint.class
41 constraints
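Note that the table assigns one ID per component name, so names such as `behaviors` and `constraints`, which occur under both `toxi/physics` and `toxi/physics2d` in the listing, each collapse into a single node. If the graph should remain a strict directory tree, one possible variation keys every node by its full path instead. This is a sketch under that assumption, not the method used in this answer; the helper name `path_nodes_and_edges` and the two sample paths are hypothetical.

```python
def path_nodes_and_edges(names):
    """Map each full path to an integer ID and return (ids, parent-child edges)."""
    paths = set()
    edges = set()
    for name in names:
        parts = name.rstrip("/").split("/")
        # Every prefix of the entry is a node: 'toxi', 'toxi/physics', ...
        prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
        paths.update(prefixes)
        # Link each prefix to the next longer one (parent -> child).
        edges.update(zip(prefixes[:-1], prefixes[1:]))
    # Sorting makes the numbering reproducible between runs.
    ids = {path: index for index, path in enumerate(sorted(paths))}
    return ids, {(ids[parent], ids[child]) for parent, child in edges}


sample = [
    "toxi/physics/behaviors/",
    "toxi/physics2d/behaviors/",
]
ids, edges = path_nodes_and_edges(sample)
for path, index in sorted(ids.items(), key=lambda item: item[1]):
    print(index, path)
```

With full paths as keys, the two `behaviors` directories receive different IDs, at the cost of longer node labels.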
The path names are converted into edges, assembled into a network with NetworkX, and drawn with matplotlib.
import zipfile

import matplotlib.pyplot as plt
import networkx as nx

# Download the code from
# http://www.openprocessing.org/sketch/46757
# Unzip and find the jar file: verletphysics.jar
# This example uses that file for demo


def get_edges(fName):
    """Return (edges, nodes) built from the path names stored in a jar file."""
    edges = []
    nodes = []
    with zipfile.ZipFile(fName, "r") as jar:
        for name in jar.namelist():
            print(name)  # prints the list of files in the jar
            if name.endswith('/'):
                name = name[:-1]
            parts = name.split('/')
            nodes.extend(parts)
            if len(parts) > 1:
                # link each path component to the next one in the same path
                edges += zip(parts[:-1], parts[1:])
    # give every distinct name an integer ID; set order is arbitrary,
    # so the numbering can differ between runs
    nodes = set(nodes)
    nodes = dict(zip(nodes, range(len(nodes))))
    edges = [(nodes[edge[0]], nodes[edge[1]]) for edge in edges]
    nodes = [(index, label) for label, index in nodes.items()]
    nodes = sorted(nodes, key=lambda node: node[0])
    return set(edges), nodes


if __name__ == '__main__':
    fName = 'verletphysics.jar'
    edges, nodes = get_edges(fName)
    # print list of nodes
    # serving as a key to the graph
    print('%10s %s' % ('Index', 'File'))
    for node in nodes:
        print('%10s %s' % (node[0], node[1]))
    # Plot the network graph
    G = nx.Graph()
    # add all nodes first so top-level files without a parent still appear
    G.add_nodes_from(index for index, label in nodes)
    G.add_edges_from(edges)
    nx.draw_networkx(G, pos=nx.spring_layout(G))
    plt.axis('off')
    plt.show()
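For the follow-up task mentioned in the question, evaluating isomorphism with another jar file, one option when the contents are treated as a rooted directory tree is to compare canonical encodings of the two trees. The sketch below is a swapped-in technique (a canonical-form comparison for rooted, unordered trees that ignores names), not something the code above does; for general graphs NetworkX provides `nx.is_isomorphic`. The nested dictionary also serves as one possible data structure for part (a) of the question. The function names and the sample entry lists are hypothetical.

```python
def build_tree(names):
    """Nest the path components of each entry into a dict of dicts."""
    root = {}
    for name in names:
        node = root
        for part in name.rstrip("/").split("/"):
            node = node.setdefault(part, {})
    return root


def canonical_form(tree):
    """Encode the shape of a rooted, unordered tree, ignoring node names."""
    children = sorted(canonical_form(child) for child in tree.values())
    return "(" + "".join(children) + ")"


def same_shape(names_a, names_b):
    """True if the two directory trees are isomorphic as rooted trees."""
    return canonical_form(build_tree(names_a)) == canonical_form(build_tree(names_b))


jar_a = ["META-INF/MANIFEST.MF", "toxi/physics/VerletParticle.class"]
jar_b = ["lib/core/Main.class", "docs/README.txt"]
jar_c = ["lib/Main.class", "lib/Util.class"]

print(same_shape(jar_a, jar_b))
print(same_shape(jar_a, jar_c))
```

Sorting the child encodings before joining them is what makes the result independent of the order in which entries appear in the archive.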