Text file with repeated elements to JSON

Date: 2016-01-23 03:04:04

Tags: python json

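I want to generate a JSON file that I can then use with a tool like D3 to make sense of the different commands of this new network element.

Here is a sample of what each command looks like.

Each line in the sample is a command. It would be very useful to get a "tree" out of it, that is, an output format in which the commands are grouped by their shared leading elements.

I tried: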

import json
import os
import shutil

# Collect the tokens of every non-blank line, skipping consecutive duplicates.
Extract = []
PL = ""
with open('pana') as f:  # Pana = https://en.wikipedia.org/wiki/Venezuelan_Spanish
    for line in f:
        text = line.strip()
        if text and text != PL:
            Extract.append(text.split())
        PL = text

# Start from an empty scratch directory.
shutil.rmtree('TreeTmp', ignore_errors=True)
os.mkdir('TreeTmp')

# Create one nested directory per command, one level per token.
for tokens in Extract:
    # Tokens containing "<" (placeholders such as <name>) become "NA-VA".
    parts = ["NA-VA" if "<" in token else token for token in tokens]
    os.makedirs(os.path.join('TreeTmp', *parts), exist_ok=True)


def path_to_dict(path):
    d = {'name': os.path.basename(path)}
    if os.path.isdir(path):
        d['children'] = [path_to_dict(os.path.join(path, x)) for x in os.listdir(path)]
    else:
        d['type'] = "file"
    return d

# No trailing slash, so os.path.basename() yields 'TreeTmp' rather than ''.
with open('flare.json', 'w') as outfile:
    json.dump(path_to_dict('./TreeTmp'), outfile)

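I have searched Stack Exchange extensively and considered several options: parsing the text into a binary tree and parsing it back, building arrays of arrays of the elements, or walking a created directory tree (the hack code above). I am really struggling to get the logic right, or to work out which approach is best.

1 answer:

Answer 0 (score: 0)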

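If you are only looking for the unique commands and their order does not matter, you can get a nice tree representation with a dictionary of dictionaries in Python with almost no effort.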

def add_items(tree, items):
    if items[0] not in tree:
        branch = {}
        tree[items[0]] = branch
    else:
        branch = tree[items[0]]
    if len(items) > 1:
        add_items(branch, items[1:])    

tree = {}
with open('commands.txt') as f:
    for line in f:
        items = line.strip().split()
        if items:  # skip blank lines, which would make items[0] fail
            add_items(tree, items)

from pprint import pprint
pprint(tree)

Output:

{'param-1': {'param-2': {'param-X': {'gateway': {'<name>': {'protocol': {'param-Xv1': {'dpd': {}}},
                                                            'protocol-common': {'fragmentation': {},
                                                                                'nat-traversal': {}}}},
                                     'param-Z': {'ipsec-param-Z': {'<name>': {'ah': {'authentication': {}},
                                                                              'esp': {'authentication': {},
                                                                                      'encryption': {}},
                                                                              'lifesize': {},
                                                                              'lifetime': {}}},
                                                 'param-X-param-Z': {'<name>': {'A': {},
                                                                                'B': {},
                                                                                'C': {},
                                                                                'D': {}}}}}},
             'param-3': {'param-3': {'ntp-servers': {},
                                     'param-5': {'param-6': {}},
                                     'update-schedule': {'global-protect-datafile': {'recurring': {'weekly': {}}}}}}}}
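
To feed this dictionary of dictionaries to D3, it still has to be converted to the name/children layout and serialized as JSON. The following is a minimal sketch of one way to do that, not part of the original answer: the helper to_d3 is a hypothetical name, and a small hand-written tree stands in for the one built from commands.txt.

```python
import json

def to_d3(name, subtree):
    """Convert one dict-of-dicts node into a {'name', 'children'} node."""
    node = {'name': name}
    if subtree:
        # Recurse into each child key; leaves (empty dicts) get no 'children'.
        node['children'] = [to_d3(key, value) for key, value in subtree.items()]
    return node

# Hypothetical sample standing in for the tree built from commands.txt.
tree = {'param-1': {'param-2': {'A': {}, 'B': {}}, 'param-3': {}}}

# The commands have no common top-level element, so wrap them in a 'root' node.
d3_tree = to_d3('root', tree)

print(json.dumps(d3_tree, indent=2))
```

Replacing the print call with json.dump(d3_tree, outfile) inside a with open('flare.json', 'w') block would write the same structure to the file used in the question.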

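If you must have the format you specified in the question, with name and children keys, this should work (it needs Python 3 for the head, *tail = items unpacking):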

class Element(object):
    def __init__(self, name, children=None):
        self.name = name    
        self.children = [] if children is None else children
    def to_dict(self):
        d = {'name': self.name}
        if self.children:
            d['children'] = [c.to_dict() for c in self.children]
        return d

def add_items(children, items):
    head, *tail = items
    for child in children:
        if child.name == head:
            break
    else:
        child = Element(head)
        children.append(child)
    if tail:
        add_items(child.children, tail)

root = Element('root')

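with open('commands.txt') as f:
    for line in f:
        items = line.strip().split()
        if items:  # skip blank lines, which cannot be unpacked into head, *tail
            add_items(root.children, items)

from pprint import pprint
pprint(root.to_dict())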

Output:

{'children': [{'children': [{'children': [{'children': [{'children': [{'name': 'param-6'}],
         'name': 'param-5'},
        {'name': 'ntp-servers'},
        {'children': [{'children': [{'children': [{'name': 'weekly'}],
             'name': 'recurring'}],
           'name': 'global-protect-datafile'}],
         'name': 'update-schedule'}],
       'name': 'param-3'}],
     'name': 'param-3'},
    {'children': [{'children': [{'children': [{'children': [{'children': [{'children': [{'name': 'dpd'}],
               'name': 'param-Xv1'}],
             'name': 'protocol'},
            {'children': [{'name': 'nat-traversal'},
              {'name': 'fragmentation'}],
             'name': 'protocol-common'}],
           'name': '<name>'}],
         'name': 'gateway'},
        {'children': [{'children': [{'children': [{'name': 'A'},
              {'name': 'B'},
              {'name': 'C'},
              {'name': 'D'}],
             'name': '<name>'}],
           'name': 'param-X-param-Z'},
          {'children': [{'children': [{'children': [{'name': 'encryption'},
                {'name': 'authentication'}],
               'name': 'esp'},
              {'children': [{'name': 'authentication'}], 'name': 'ah'},
              {'name': 'lifetime'},
              {'name': 'lifesize'}],
             'name': '<name>'}],
           'name': 'ipsec-param-Z'}],
         'name': 'param-Z'}],
       'name': 'param-X'}],
     'name': 'param-2'}],
   'name': 'param-1'}],
 'name': 'root'}
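
Since the goal in the question is a JSON file for D3, the result of to_dict() can be handed straight to the json module. Below is a self-contained sketch under a few assumptions: the class is a compact variant of Element with a hypothetical child() method that replaces the recursive add_items, and the lines list holds made-up sample commands in place of commands.txt.

```python
import json

class Element:
    def __init__(self, name):
        self.name = name
        self.children = []

    def child(self, name):
        """Return the child called `name`, creating it if it is missing."""
        for c in self.children:
            if c.name == name:
                return c
        c = Element(name)
        self.children.append(c)
        return c

    def to_dict(self):
        d = {'name': self.name}
        if self.children:
            d['children'] = [c.to_dict() for c in self.children]
        return d

# Hypothetical sample lines; the real input would come from commands.txt.
lines = [
    'param-1 param-2 param-X gateway <name> protocol',
    'param-1 param-2 param-X gateway <name> protocol-common',
    '',
    'param-1 param-3 ntp-servers',
]

root = Element('root')
for line in lines:
    node = root
    for item in line.split():  # a blank line yields no items and is skipped
        node = node.child(item)

flare = json.dumps(root.to_dict(), indent=2)
print(flare)
```

Walking down the tree in a loop avoids slicing the item list on every recursive call. Because children are kept in a list, they appear in the JSON in the order in which they were first seen in the input.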