Parsing a file with a nested loop structure into a list structure using Python

Time: 2015-08-10 14:41:17

Tags: python parsing pyparsing

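I am struggling to parse an FPGA simulation file (.vwf), specifically the part where input waveforms are specified using a kind of nested loop system. An example of the file format is: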

TRANSITION_LIST("ADDR[0]")
{
    NODE
    {
        REPEAT = 1;
        LEVEL 0 FOR 100.0;
        LEVEL 1 FOR 100.0;
        NODE
        {
            REPEAT = 3;
            LEVEL 0 FOR 100.0;
            LEVEL 1 FOR 100.0;
            NODE
            {
                REPEAT = 2;
                LEVEL 0 FOR 200.0;
                LEVEL 1 FOR 200.0;
            }
        }
        LEVEL 0 FOR 100.0;
    }
}

This means that the logic value of the channel named "ADDR[0]" switches as follows:

LEVEL 0 FOR 100.0;
LEVEL 1 FOR 100.0;
LEVEL 0 FOR 100.0;
LEVEL 1 FOR 100.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 100.0;
LEVEL 1 FOR 100.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 100.0;
LEVEL 1 FOR 100.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 200.0;
LEVEL 1 FOR 200.0;
LEVEL 0 FOR 100.0;

I have set out to convert this information into a list structure that looks like this:

[[0, 100], [1, 100], [0, 100], [1, 100], [0, 200], [1, 200], [0, 200], [1, 200], [0, 100], [1, 100], [0, 200], [1, 200], [0, 200], [1, 200], [0, 100], [1, 100], [0, 200], [1, 200], [0, 200], [1, 200], [0, 100]]
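As a sanity check on the target, the nesting can be written out by hand with list repetition, working from the innermost NODE outwards. This is only a sketch for this one example; the names `inner`, `middle`, `outer` and `expected` are illustrative and not part of the file format.

```python
# Build the expected result by hand, innermost NODE first.
# Tuples are used so that repeating a body with * cannot alias mutable entries.
inner = [(0, 200), (1, 200)] * 2                            # REPEAT = 2
middle = ([(0, 100), (1, 100)] + inner) * 3                 # REPEAT = 3
outer = ([(0, 100), (1, 100)] + middle + [(0, 100)]) * 1    # REPEAT = 1

# Convert to the [[level, duration], ...] shape used in the question.
expected = [list(entry) for entry in outer]
print(len(expected))
print(expected[-1])
```

The result has one entry per line of the expanded LEVEL sequence above, in the same order.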

However, I am struggling to work out how to do this. I tried something that I thought worked, but on revisiting it I found my mistakes.

import pyparsing as pp


def get_data(LINES):
    node_inst = []
    total_inst = []
    r = []
    c = 0

    rep_search = pp.Literal('REPEAT = ') + pp.Word(pp.nums)
    log_search = pp.Literal('LEVEL') + pp.Word('01') + pp.Literal('FOR') + pp.Word(pp.nums + '.')
    bra_search = pp.Literal('}')

    for line in LINES:
        print(line)
        rep = rep_search.searchString(line)
        log = log_search.searchString(line)
        bra = bra_search.searchString(line)

        if rep:
            #print(line)
            c += 1
            if c > 1: # no logic values have been found when c == 1
                for R in range(r[-1]):
                    for n in node_inst:
                        total_inst.append(n)
                node_inst = []
            r.append(int(rep[0][-1]))

        elif log:
            #print(line)
            node_inst.append([int(log[0][1]),
                              int(round(1000 * float(log[0][-1])))])

        elif bra:
            #print(line)
            if node_inst:
                for R in range(r[-1]):
                    for n in node_inst:
                        total_inst.append(n)
                node_inst = []
            if r:
                del r[-1]

    return total_inst

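Here r is a list that keeps track of the repeat values, with the last value deleted whenever a '}' is encountered. This produces something close, but any values inside the loop that repeats 2 times are only repeated 2 times, rather than also being repeated as part of the loop that repeats 3 times.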

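Any help or hints would be greatly appreciated. I am just drawing a blank on how to script this. Anything about my code can be changed, but as far as I know the input file format cannot.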
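The symptom described above is consistent with inner values being written straight into the flat output list as soon as their own '}' is seen, so an enclosing REPEAT has nothing left to replay. A minimal sketch of one way around this, assuming every keyword sits on its own line as in the sample, is to keep a stack of [repeat, body] frames and to fold each finished body into its parent instead of into the final result. The names `expand_lines`, `REPEAT_RE`, `LEVEL_RE` and `SAMPLE` are illustrative, and durations are kept as floats here rather than scaled.

```python
import re

REPEAT_RE = re.compile(r'REPEAT\s*=\s*(\d+)')
LEVEL_RE = re.compile(r'LEVEL\s+([01])\s+FOR\s+([\d.]+)')


def expand_lines(lines):
    """Expand nested NODE/REPEAT blocks into a flat [[level, duration], ...] list."""
    # Each frame is [repeat_count, body]; the bottom frame collects the result.
    stack = [[1, []]]
    for raw in lines:
        line = raw.strip()
        if line == 'NODE':
            stack.append([1, []])              # open a new frame for this node
        elif line == '}' and len(stack) > 1:
            repeat, body = stack.pop()         # node finished:
            stack[-1][1].extend(body * repeat) # replay it inside its parent
        else:
            rep = REPEAT_RE.match(line)
            lev = LEVEL_RE.match(line)
            if rep:
                stack[-1][0] = int(rep.group(1))
            elif lev:
                # Stored as a tuple so that body * repeat cannot alias entries.
                stack[-1][1].append((int(lev.group(1)), float(lev.group(2))))
    return [list(entry) for entry in stack[0][1]]


SAMPLE = """
TRANSITION_LIST("ADDR[0]")
{
    NODE
    {
        REPEAT = 1;
        LEVEL 0 FOR 100.0;
        LEVEL 1 FOR 100.0;
        NODE
        {
            REPEAT = 3;
            LEVEL 0 FOR 100.0;
            LEVEL 1 FOR 100.0;
            NODE
            {
                REPEAT = 2;
                LEVEL 0 FOR 200.0;
                LEVEL 1 FOR 200.0;
            }
        }
        LEVEL 0 FOR 100.0;
    }
}
"""

result = expand_lines(SAMPLE.splitlines())
print(len(result))
```

The closing brace of TRANSITION_LIST arrives when only the bottom frame is left, so the `len(stack) > 1` check simply ignores it. Malformed input (for example a missing '}') is not detected in this sketch.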

1 Answer:

Answer 0: (score: 2)

Something like this; bear in mind that it depends heavily on the formatting.

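import re


class Node:
    """One NODE block: a repeat count plus an ordered list of levels and child nodes."""

    def __init__(self, parent, items, repeat):
        self.parent = parent
        self.items = items
        self.repeat = repeat


# The root is a synthetic node that holds the top-level NODE blocks.
root = Node(None, [], 1)
node = root

with open('levels.txt') as f:
    for line in f:
        line = line.strip()
        if line == 'NODE':
            # Open a child node and descend into it.
            new_node = Node(node, [], 1)
            node.items.append(new_node)
            node = new_node

        # A closing brace ends the current node. The brace that closes
        # TRANSITION_LIST arrives while already at the root, so it is ignored.
        if line == '}' and node.parent is not None:
            node = node.parent

        res = re.match(r'REPEAT = (\d+)', line)
        if res:
            node.repeat = int(res.group(1))

        res = re.match(r'LEVEL (\d+) FOR ([\d.]+)', line)
        if res:
            node.items.append((int(res.group(1)), float(res.group(2))))


def generate(node):
    """Expand a node recursively into a flat list of (level, duration) tuples."""
    res = []
    for _ in range(node.repeat):
        for item in node.items:
            if isinstance(item, Node):
                res.extend(generate(item))
            elif isinstance(item, tuple):
                res.append(item)
    return res


res = generate(root)

for r in res:
    print(r)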
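Since the line-by-line matching above relies on each keyword sitting on its own line, a token-based variant may be more forgiving of layout. The following is a sketch only, assuming well-formed input and that the words NODE, REPEAT and LEVEL do not occur inside channel names; `TOKEN_RE`, `parse_node` and `expand_text` are hypothetical names, not part of the answer's code.

```python
import re

# One alternative per token kind; anything between tokens is skipped.
TOKEN_RE = re.compile(
    r'(?P<node>\bNODE\b)'
    r'|(?P<open>\{)'
    r'|(?P<close>\})'
    r'|REPEAT\s*=\s*(?P<repeat>\d+)'
    r'|LEVEL\s+(?P<level>\d+)\s+FOR\s+(?P<duration>[\d.]+)'
)


def parse_node(tokens, pos):
    """Expand the NODE starting at tokens[pos]; return (entries, next position)."""
    pos += 2  # skip the NODE keyword and its opening brace
    repeat = 1
    body = []
    while tokens[pos][0] != 'close':
        kind, match = tokens[pos]
        if kind == 'node':
            child, pos = parse_node(tokens, pos)  # nested node: recurse
            body.extend(child)
            continue
        if kind == 'repeat':
            repeat = int(match.group('repeat'))
        elif kind == 'duration':  # a LEVEL token; its last named group is 'duration'
            body.append((int(match.group('level')),
                         float(match.group('duration'))))
        pos += 1
    return body * repeat, pos + 1  # step past the closing brace


def expand_text(text):
    """Return a flat list of (level, duration) tuples for every NODE in text."""
    tokens = [(m.lastgroup, m) for m in TOKEN_RE.finditer(text)]
    out = []
    pos = 0
    while pos < len(tokens):
        if tokens[pos][0] == 'node':
            child, pos = parse_node(tokens, pos)
            out.extend(child)
        else:
            pos += 1  # braces belonging to TRANSITION_LIST itself
    return out


one_line = 'NODE { REPEAT = 2; LEVEL 0 FOR 1.5; NODE { REPEAT = 2; LEVEL 1 FOR 2; } }'
print(expand_text(one_line))
```

Because the input is tokenized as a whole, the same function accepts the indented multi-line layout from the question and a single-line layout like `one_line`.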