Question

I have a file like this:

grouping data-rate-parameters {
    description
      "Data rate configuration parameters.";
    reference
      "ITU-T G.997.2 clause 7.2.1.";

    leaf maximum-net-data-rate {
      type bbf-yang:data-rate32;
      default "4294967295";
      description
        "Defines the value of the maximum net data rate (see clause
         11.4.2.2/G.9701).";
      reference
        "ITU-T G.997.2 clause 7.2.1.1 (MAXNDR).";
    }

      leaf psd-level {
        type psd-level;
        description
          "The PSD level of the referenced sub-carrier.";
      }
    }
  }

  grouping line-spectrum-profile {
    description
      "Defines the parameters contained in a line spectrum
       profile.";

    leaf profiles {
      type union {
        type enumeration {
          enum "all" {
            description
              "Used to indicate that all profiles are allowed.";
          }
        }
        type profiles;
      }

Here I want to extract each leaf block. For example, the leaf maximum-net-data-rate block is:

leaf maximum-net-data-rate {
          type bbf-yang:data-rate32;
          default "4294967295";
          description
            "Defines the value of the maximum net data rate (see clause
             11.4.2.2/G.9701).";
          reference
            "ITU-T G.997.2 clause 7.2.1.1 (MAXNDR).";
        }

I want to extract blocks like this.

I tried the following code, which tries to read a block by counting braces ('{' and '}'):

with open(r'file.txt','r') as f:
    leaf_part = []
    count = 0
    c = 'psd-level'
    for line in f:
        if 'leaf %s {'%c in line:
            cur_line=line
            for line in f:
                pre_line=cur_line
                cur_line=line
                if '{' in pre_line:
                    leaf_part.append(pre_line)
                    count+=1
                elif '}' in pre_line:
                    leaf_part.append(pre_line)
                    count-=1
                elif count==0:
                    break
                else:
                    leaf_part.append(pre_line)

It works for leaf maximum-net-data-rate, but not for leaf psd-level.

For leaf psd-level, the result also includes lines from outside the block.

Please help me accomplish this.

Answer 1

You can use a regular expression (below, file is the content of the file read into a string):

import re
reg = re.compile(r"leaf.+?\{.+?\}", re.DOTALL)
reg.findall(file)

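This returns a list of all matching blocks. If you want to search for a specific leaf name, you can use format (remember to double the curly braces):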

leafname = "maximum-net-data-rate"
reg = re.compile(r"leaf\s{0}.+?\{{.+?\}}".format(leafname), re.DOTALL)

Edit: for Python 2.7

reg = re.compile(r"leaf\s%s.+?\{.+?\}" % leafname, re.DOTALL)
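To make the name-specific search concrete, here is a minimal sketch using only the built-in re module. The helper name find_leaf and the shortened sample text are made up for illustration, and re.escape is added as a precaution for names containing regex metacharacters. It also shows the limitation of the non-greedy pattern: the match stops at the first closing brace, so a leaf with nested braces comes back truncated.

```python
import re

# Shortened sample, loosely based on the file in the question.
text = '''
leaf maximum-net-data-rate {
  type bbf-yang:data-rate32;
  default "4294967295";
}

leaf profiles {
  type union {
    type enumeration {
      enum "all" {
        description "x";
      }
    }
    type profiles;
  }
}
'''

def find_leaf(text, leafname):
    # Non-greedy match from "leaf <name> {" up to the FIRST closing brace.
    pattern = re.compile(r"leaf\s+%s\s*\{.+?\}" % re.escape(leafname), re.DOTALL)
    return pattern.findall(text)

flat = find_leaf(text, "maximum-net-data-rate")   # complete block
nested = find_leaf(text, "profiles")              # cut off at the first '}'

print(flat[0])
print(nested[0].count("{"), nested[0].count("}"))  # unbalanced braces
```

The unbalanced brace counts for the nested leaf are the reason a plain regex is not enough for blocks such as leaf profiles.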

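EDIT 2: I completely missed the nested braces in your last example. Handling them is more complicated than a simple regex, so you may want to consider another approach. Still, it can be done. First, you need to install the regex module, because the built-in re does not support recursive patterns.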

pip install regex

Second, here is the pattern:

import regex
reg = regex.compile(r"(leaf.*?)({(?>[^\{\}]|(?2))*})", regex.DOTALL)
reg.findall(file)

This pattern returns a list of tuples, so you will probably want to do something like this:

res = [el[0]+el[1] for el in reg.findall(file)]

This should give you the complete list of results.
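If installing a third-party module is not an option, the same balanced-brace matching can be approximated with the standard library alone. The sketch below is not the recursive pattern from this answer: it uses re only to find each leaf header and then scans forward character by character, tracking brace depth. The function name extract_leaf_blocks and the sample are hypothetical, and the code assumes that braces never appear inside quoted strings.

```python
import re

def extract_leaf_blocks(text):
    """Return every 'leaf <name> { ... }' block with balanced braces."""
    blocks = []
    # Find each leaf header; match.end() - 1 is the position of its '{'.
    for match in re.finditer(r"\bleaf\s+[\w.-]+\s*\{", text):
        depth = 0
        for pos in range(match.end() - 1, len(text)):
            ch = text[pos]
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    # The brace that opened the leaf has just been closed.
                    blocks.append(text[match.start():pos + 1])
                    break
    return blocks

# Assumption: braces inside quoted strings are not handled.
sample = '''
grouping g {
  leaf a {
    type union {
      type enumeration {
        enum "all" { description "x"; }
      }
    }
  }
  leaf b {
    type psd-level;
  }
}
'''

blocks = extract_leaf_blocks(sample)
for block in blocks:
    print(block)
    print("-----")
```

A leaf whose braces never balance (for example, a truncated file) is simply skipped rather than returned partially.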

Answer 2

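It only needs a simple edit to the break condition in your loop. Because there are several closing braces '}' after the block, your count has already gone negative by the time the check runs, so you need to change that line to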

elif count<=0:
    break

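However, it still appends the extra closing braces to your list. You can handle that by keeping a record of the open braces. I changed the code as follows: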

with open(r'file.txt','r') as f:
    leaf_part = []
    braces_record = []
    count = 0
    c = 'psd-level'
    for line in f:
        if 'leaf %s {'%c in line:
            braces_record.append('{')
            cur_line=line
            for line in f:
                pre_line=cur_line
                cur_line=line
                if '{' in pre_line:
                    braces_record.append('{')
                    leaf_part.append(pre_line)
                    count+=1
                elif '}' in pre_line:
                    try:
                        braces_record.pop()
                        if len(braces_record)>0:
                            leaf_part.append(pre_line)
                    except:
                        pass
                    count-=1
                elif count<=0:
                    break
                elif '}' not in pre_line:
                    leaf_part.append(pre_line)

Result of the above code:

      leaf psd-level {
        type psd-level;
        description
          "The PSD level of the referenced sub-carrier.";
  }
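As a possible simplification of the same idea, the depth can be tracked with a single counter updated from the number of braces on each line, which avoids the look-behind pre_line/cur_line bookkeeping. This is only a sketch: read_leaf_block is a made-up helper name, and it assumes braces do not occur inside quoted strings.

```python
def read_leaf_block(lines, name):
    """Collect the lines of 'leaf <name> { ... }' by tracking brace depth."""
    block = []
    depth = 0
    header = "leaf %s {" % name
    for line in lines:
        if not block and header not in line:
            continue  # still looking for the start of the block
        block.append(line)
        # Opening braces raise the depth, closing braces lower it.
        depth += line.count("{") - line.count("}")
        if depth <= 0:
            break  # the brace that opened the leaf has been closed
    return block

# The extra closing braces after the leaf mirror the file in the question.
lines = [
    "      leaf psd-level {\n",
    "        type psd-level;\n",
    "        description\n",
    '          "The PSD level of the referenced sub-carrier.";\n',
    "      }\n",
    "    }\n",
    "  }\n",
]

result = read_leaf_block(lines, "psd-level")
print("".join(result))
```

Because the function only iterates over its first argument, an open file object can be passed instead of a list, for example read_leaf_block(f, 'psd-level') inside the with open(r'file.txt','r') as f: block.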
