How to represent a jar file as a network graph?

Time: 2012-06-11 18:29:12

Tags: java python graph jar networkx

While trying to answer the question "Graph isomorphism for jar files", a further question naturally arose: how can a jar file be represented as a graph using Python?

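Question: Given a jar file, read the files it contains and represent the contents as (a) a data structure and (b) a graph, both suitable for further study and manipulation, for example evaluating isomorphism with another jar file. In the graph, the directory tree should form the root and branch nodes, with the files at the ends as leaf nodes.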

To standardize the answers, I use the verletphysics.jar file downloaded from this OpenProcessing sketch.

1 Answer:

Answer 0: (score: 8)

Solution

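Since jar files are essentially zip archives, use the zipfile module from the Python standard library to read the contents and prepare text and graph representations of how the jar's contents relate to one another.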
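As a minimal sketch of that first step, the snippet below lists the entries of an archive with zipfile. The helper names list_jar_entries and make_demo_jar, and the tiny in-memory archive, are assumptions made so the example runs without downloading verletphysics.jar; they are not part of the original answer.

```python
import io
import zipfile


def list_jar_entries(source):
    """Return the entry names stored in a jar (zip) archive.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source, "r") as jar:
        return jar.namelist()


def make_demo_jar():
    """Build a tiny in-memory archive (hypothetical contents) for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        jar.writestr("toxi/physics/VerletParticle.class", b"")
        jar.writestr("toxi/physics2d/VerletParticle2D.class", b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for entry in list_jar_entries(make_demo_jar()):
        print(entry)
```

To read the real file, pass its path instead: list_jar_entries('verletphysics.jar'). Directory entries such as toxi/ show up only when the archive actually stores them, as verletphysics.jar does.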

Text representation

For the file verletphysics.jar mentioned in the question, the code below produces this list of contents:

META-INF/
META-INF/MANIFEST.MF
toxi/
toxi/physics/
toxi/physics/behaviors/
toxi/physics/constraints/
toxi/physics2d/
toxi/physics2d/behaviors/
toxi/physics2d/constraints/
toxi/physics/ParticlePath.class
toxi/physics/ParticleString.class
toxi/physics/PullBackString.class
toxi/physics/VerletConstrainedSpring.class
toxi/physics/VerletMinDistanceSpring.class
toxi/physics/VerletParticle.class
toxi/physics/VerletPhysics.class
toxi/physics/VerletSpring.class
toxi/physics/behaviors/AttractionBehavior.class
toxi/physics/behaviors/ConstantForceBehavior.class
toxi/physics/behaviors/GravityBehavior.class
toxi/physics/behaviors/ParticleBehavior.class
toxi/physics/constraints/AxisConstraint.class
toxi/physics/constraints/BoxConstraint.class
toxi/physics/constraints/CylinderConstraint.class
toxi/physics/constraints/MaxConstraint.class
toxi/physics/constraints/MinConstraint.class
toxi/physics/constraints/ParticleConstraint.class
toxi/physics/constraints/PlaneConstraint.class
toxi/physics/constraints/SoftBoxConstraint.class
toxi/physics/constraints/SphereConstraint.class
toxi/physics2d/ParticlePath2D.class
toxi/physics2d/ParticleString2D.class
toxi/physics2d/PullBackString2D.class
toxi/physics2d/VerletConstrainedSpring2D.class
toxi/physics2d/VerletMinDistanceSpring2D.class
toxi/physics2d/VerletParticle2D.class
toxi/physics2d/VerletPhysics2D.class
toxi/physics2d/VerletSpring2D.class
toxi/physics2d/behaviors/AttractionBehavior.class
toxi/physics2d/behaviors/ConstantForceBehavior.class
toxi/physics2d/behaviors/GravityBehavior.class
toxi/physics2d/behaviors/ParticleBehavior2D.class
toxi/physics2d/constraints/AngularConstraint.class
toxi/physics2d/constraints/AxisConstraint.class
toxi/physics2d/constraints/CircularConstraint.class
toxi/physics2d/constraints/MaxConstraint.class
toxi/physics2d/constraints/MinConstraint.class
toxi/physics2d/constraints/ParticleConstraint2D.class
toxi/physics2d/constraints/RectConstraint.class
verletphysics.mf

Key

Each name component in the path names above is extracted and given a unique ID by the code, as shown here:

 Index  File
     0  behaviors
     1  BoxConstraint.class
     2  MaxConstraint.class
     3  VerletParticle.class
     4  ParticleConstraint2D.class
     5  ConstantForceBehavior.class
     6  META-INF
     7  VerletMinDistanceSpring2D.class
     8  AxisConstraint.class
     9  AttractionBehavior.class
    10  physics2d
    11  VerletPhysics.class
    12  PullBackString.class
    13  VerletSpring.class
    14  VerletConstrainedSpring.class
    15  ParticleString2D.class
    16  verletphysics.mf
    17  ParticleBehavior2D.class
    18  ParticleString.class
    19  RectConstraint.class
    20  CylinderConstraint.class
    21  toxi
    22  VerletMinDistanceSpring.class
    23  VerletSpring2D.class
    24  VerletParticle2D.class
    25  ParticlePath2D.class
    26  CircularConstraint.class
    27  ParticlePath.class
    28  MinConstraint.class
    29  MANIFEST.MF
    30  ParticleConstraint.class
    31  GravityBehavior.class
    32  VerletPhysics2D.class
    33  SoftBoxConstraint.class
    34  ParticleBehavior.class
    35  VerletConstrainedSpring2D.class
    36  PlaneConstraint.class
    37  PullBackString2D.class
    38  SphereConstraint.class
    39  physics
    40  AngularConstraint.class
    41  constraints
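Note that the key indexes bare names, so behaviors and constraints each get a single ID even though both occur under toxi/physics and toxi/physics2d. If a strict directory tree is wanted, one possible variation (an assumption on my part, not what the answer's code does) is to index full paths instead. The helpers path_prefixes and index_paths are hypothetical names used only for this sketch.

```python
def path_prefixes(name):
    """Yield every ancestor path of an archive entry, then the entry itself."""
    parts = name.rstrip("/").split("/")
    for depth in range(1, len(parts) + 1):
        yield "/".join(parts[:depth])


def index_paths(names):
    """Map each distinct full path to an integer ID, assigned in sorted order."""
    paths = set()
    for name in names:
        paths.update(path_prefixes(name))
    return {path: index for index, path in enumerate(sorted(paths))}


if __name__ == "__main__":
    sample = [
        "toxi/physics/behaviors/GravityBehavior.class",
        "toxi/physics2d/behaviors/GravityBehavior.class",
    ]
    ids = index_paths(sample)
    for path, index in sorted(ids.items(), key=lambda item: item[1]):
        print("%10s  %s" % (index, path))
```

With full paths as keys, the two behaviors directories stay distinct nodes, at the cost of longer labels in the key.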

Graph

The path names are converted into edges, which are built into this network using NetworkX and drawn with matplotlib.

[Image: network graph of jar file contents]
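The question asks for the directory tree as root and branch nodes with files as leaves. A hedged sketch of one way to get that shape is a directed graph whose nodes are full paths, hung under a single root. The function build_tree and the root label '<jar>' are hypothetical, chosen for illustration; the sample names are taken from the listing above.

```python
import networkx as nx


def build_tree(names, root="<jar>"):
    """Build a directed tree from archive entry names.

    Nodes are full paths; `root` is a hypothetical label for the archive itself.
    """
    tree = nx.DiGraph()
    tree.add_node(root)
    for name in names:
        parent = root
        parts = name.rstrip("/").split("/")
        for depth in range(1, len(parts) + 1):
            child = "/".join(parts[:depth])
            tree.add_edge(parent, child)  # re-adding an existing edge changes nothing
            parent = child
    return tree


if __name__ == "__main__":
    names = [
        "META-INF/MANIFEST.MF",
        "toxi/physics/behaviors/GravityBehavior.class",
        "toxi/physics2d/behaviors/GravityBehavior.class",
    ]
    tree = build_tree(names)
    leaves = sorted(n for n in tree.nodes() if tree.out_degree(n) == 0)
    print(nx.is_tree(tree), leaves)
```

Two trees built this way from different jars could then be handed to nx.is_isomorphic for the structural comparison the question mentions.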

The Code

import zipfile
import networkx as nx
import matplotlib.pyplot as plt

# Download the code from
# http://www.openprocessing.org/sketch/46757
# Unzip and find the jar file: verletphysics.jar
# This example uses that file for demo

def get_edges(fName):
    edges = []
    nodes = []

    # A jar is a zip archive, so zipfile can list its entries
    with zipfile.ZipFile(fName, "r") as jar:
        names = jar.namelist()

    for name in names:
        print(name)  # prints the list of files in the jar
        if name.endswith('/'): name = name[:-1]
        parts = name.split('/')
        nodes.extend(parts)
        if len(parts) > 1:
            # link each path component to the next one in the same path
            # (pairing over `parts`, not the accumulated `nodes` list)
            edges += zip(parts[:-1], parts[1:])

    # assign a unique integer ID to each distinct name
    nodes = set(nodes)
    nodes = dict(zip(nodes, range(len(nodes))))
    edges = [(nodes[edge[0]], nodes[edge[1]])
             for edge in edges]
    nodes = [(index, label) for label, index in nodes.items()]
    nodes = sorted(nodes, key=lambda node: node[0])
    return set(edges), nodes

if __name__ == '__main__':
    fName = 'verletphysics.jar'
    edges, nodes = get_edges(fName)

    # print list of nodes
    # serving as a key to the graph
    print('%10s  %s' % ('Index', 'File'))
    for node in nodes:
        print('%10s  %s' % (node[0], node[1]))

    # Plot the network graph 
    G = nx.Graph()
    G.add_edges_from( edges )
    nx.draw_networkx(G, pos=nx.spring_layout(G))
    plt.axis('off')
    plt.show()