Parsing a .txt file for keywords, then looking for sub-keywords?

Time: 2019-02-04 13:00:01

Tags: python python-3.x

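Apologies, I'm not sure how to phrase this question properly.

I have a script that pulls the configuration from a Cisco switch and saves it to a .txt file, and I want to extract key information from that file.

I'd like to extract the following from the .txt file:

  • Interface names (all of them)
  • Description (if present)
  • Channel-group number (if present)

Here is an excerpt of the saved .txt file: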

!
!
!
!
interface Port-channel10
!
interface GigabitEthernet0/0
 description ** Uplink to C1 **
 switchport trunk allowed vlan 95
 switchport trunk encapsulation dot1q
 switchport mode trunk
 media-type rj45
 negotiation auto
!
interface GigabitEthernet0/1
 description ** Uplink to C2 **
 switchport trunk allowed vlan 95
 switchport trunk encapsulation dot1q
 switchport mode trunk
 media-type rj45
 negotiation auto
 channel-group 10 mode auto
!
interface GigabitEthernet0/2
 description ** Downlink to NetAuto **
 switchport access vlan 95
 switchport mode access
 media-type rj45
 negotiation auto
!
interface GigabitEthernet0/3
 switchport trunk encapsulation dot1q
 media-type rj45
 negotiation auto
 channel-group 10 mode auto
!
interface GigabitEthernet1/0
 media-type rj45
 negotiation auto
!
interface GigabitEthernet1/1
 media-type rj45

And so on...

From that, the output I'm hoping for looks like this:

interface GigabitEthernet0/0
 description ** Uplink to C1 **
interface GigabitEthernet0/1
 description ** Uplink to C2 **
 channel-group 10
interface GigabitEthernet0/2
 description ** Downlink to NetAuto **

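Below is my current code. It doesn't give me anything close to what I want, and with my limited Python knowledge I'm out of ideas: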

with open('newfile1', 'r') as fi:
    int = []
    desc = []
    for ln in fi:
        if ln.startswith("interface"):
            int = (ln)
            print(int) 
            for ln in fi: 
                if ln.startswith(" description"): 
                    desc = (ln) 
                    print(desc) 

5 Answers:

Answer 0: (score: 1)

Use simple iteration.

For example:

result = []
with open(filename) as infile:    # filename is the path to your config file
    for line in infile:           # iterate over each line
        line = line.strip()
        if line.startswith("interface GigabitEthernet"):   # start of an interface block
            result.append([line])
            while True:
                try:
                    line = next(infile).strip()
                except StopIteration:  # reached the end of the file
                    break
                if line == "!":        # "!" closes the current interface block
                    break
                if line.startswith("description"):
                    result[-1].append(line)
                if line.startswith("channel-group"):
                    result[-1].append(line)
print(result)

Output:

[['interface GigabitEthernet0/0', 'description ** Uplink to C1 **'],
 ['interface GigabitEthernet0/1',
  'description ** Uplink to C2 **',
  'channel-group 10 mode auto'],
 ['interface GigabitEthernet0/2', 'description ** Downlink to NetAuto **'],
 ['interface GigabitEthernet0/3', 'channel-group 10 mode auto'],
 ['interface GigabitEthernet1/0'],
 ['interface GigabitEthernet1/1']]

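Answer 1: (score: 1)

Structuring your data well for later use is very important. I'd suggest using a dictionary to store the details of each interface, so the data extracted from the file becomes a list of such dictionaries. The code for that looks like this: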

with open('test.txt', 'r') as file:
    data = []
    for line in file:
        if line.startswith('interface'):
            data.append(dict(interface=line.replace('interface', '').strip()))
            print(line) # check it on the console

        if line.strip().startswith('description'):
            data[-1]['description'] = line.replace('description', '').strip()
            print(line) # check it on the console

        if line.strip().startswith('channel-group'):
            data[-1]['channel-group'] = line.replace('channel-group', '').strip()
            print(line) # check it on the console

print(data) # prints a list of dicts

The data will be:

[{'interface': 'Port-channel10'}, {'interface': 'GigabitEthernet0/0', 'description': '** Uplink to C1 **'}, {'interface': 'GigabitEthernet0/1', 'description': '** Uplink to C2 **', 'channel-group': '10 mode auto'}, {'interface': 'GigabitEthernet0/2', 'description': '** Downlink to NetAuto **'}, {'interface': 'GigabitEthernet0/3', 'channel-group': '10 mode auto'}, {'interface': 'GigabitEthernet1/0'}, {'interface': 'GigabitEthernet1/1'}]
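
To turn a list of dicts like the one above back into the text layout asked for in the question, something along these lines could work. This is only a sketch: `format_interfaces` is a name made up for illustration, and it assumes the first token of the `channel-group` value is the group number.

```python
def format_interfaces(data):
    """Render a list of interface dicts as indented, config-style text."""
    lines = []
    for entry in data:
        lines.append("interface " + entry["interface"])
        if "description" in entry:
            lines.append(" description " + entry["description"])
        if "channel-group" in entry:
            # Assumption: the value looks like "10 mode auto"; keep only the number.
            parts = entry["channel-group"].split()
            if parts:
                lines.append(" channel-group " + parts[0])
    return "\n".join(lines)


sample = [
    {"interface": "GigabitEthernet0/1",
     "description": "** Uplink to C2 **",
     "channel-group": "10 mode auto"},
    {"interface": "GigabitEthernet1/0"},
]
print(format_interfaces(sample))
```

Keeping the parsing (building the dicts) separate from the formatting means the same data can later be written out as text, JSON, or CSV without touching the parsing loop.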

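Answer 2: (score: 1)

Keep it simple: split the text file into lines, split each line into words, and check whether the first word is in the list of words you are interested in.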

results = []
first_words = ['interface', 'description', 'channel-group']
input_file = 'switch.txt'

with open(input_file, 'r') as switch_file:
    for line in switch_file.readlines():
        words_in_line = line.split()
        # There should be at least 1 word in the line
        if 0 < len(words_in_line):
            first_word = words_in_line[0]
            if first_word in first_words:  # exact match, not a substring test
                results.append(line.rstrip())

print("\n".join(results))

Output:

interface Port-channel10
interface GigabitEthernet0/0
 description ** Uplink to C1 **
interface GigabitEthernet0/1
 description ** Uplink to C2 **
 channel-group 10 mode auto
interface GigabitEthernet0/2
 description ** Downlink to NetAuto **
interface GigabitEthernet0/3
 channel-group 10 mode auto
interface GigabitEthernet1/0
interface GigabitEthernet1/1

Answer 3: (score: 0)

Try:

first_words = {'interface', 'description', 'channel-group'}

res = []
with open('input.txt') as input_f:
    d = []
    # Keep only lines whose first word is one of first_words
    for i in filter(lambda l: l.strip().split(' ')[0] in first_words, input_f):
        if i.startswith('interface'):
            if d:  # the previous interface block is complete
                res.append(d)
            d = []
        d.append(i.strip())
    if d:  # append the last block, which no later "interface" line closes
        res.append(d)

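Answer 4: (score: 0)

The good way

You can do this in a few lines of code, as the other answers show, but what you really need is to build a parser: it will be much cleaner and more maintainable.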
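
As a rough idea of what a small parser could look like, here is a sketch. The names `Interface` and `parse_interfaces` are hypothetical, and the code assumes that the lines of an interface block are indented and that the block ends at the next unindented line (such as `!`).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Interface:
    name: str
    description: Optional[str] = None
    channel_group: Optional[int] = None


def parse_interfaces(lines: Iterable[str]) -> List[Interface]:
    """Collect interface blocks from the lines of a saved config."""
    interfaces: List[Interface] = []
    current: Optional[Interface] = None
    for raw in lines:
        line = raw.rstrip()
        if line.startswith("interface "):
            # A new block starts; everything after the keyword is the name.
            current = Interface(name=line.partition(" ")[2].strip())
            interfaces.append(current)
        elif current is not None and line.startswith(" "):
            # Indented lines belong to the current interface block.
            words = line.split()
            if words[0] == "description":
                current.description = line.strip()[len("description"):].strip()
            elif words[0] == "channel-group" and len(words) > 1 and words[1].isdigit():
                current.channel_group = int(words[1])
        else:
            # "!" or any other unindented line ends the block.
            current = None
    return interfaces


sample = """\
interface Port-channel10
!
interface GigabitEthernet0/1
 description ** Uplink to C2 **
 switchport mode trunk
 channel-group 10 mode auto
!
""".splitlines()

for intf in parse_interfaces(sample):
    print(intf)
```

Because the function takes any iterable of lines, it can be fed an open file object directly, and the resulting objects can be formatted or filtered separately from the parsing step.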

You can find more information here:

Workaround

If you just need a quick workaround, you can do this:

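import re

# Group the alternatives so that ^ applies to both of them
rx = re.compile(r'^(?:interface|description)\b')

with open('test.txt', 'r') as f, open('result.txt', 'w') as rf:
    # Keep the lines whose first word is "interface" or "description"
    result = [l for l in f if rx.match(l.strip())]
    rf.write(''.join(result))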
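
One detail worth knowing when writing patterns like this: alternation binds more loosely than `^`, so in a pattern of the form `^a|b` only the first alternative is anchored. With `match` this makes no difference, because `match` always starts at the beginning of the string, but with `search` it does. A small sketch (the names `loose` and `grouped` are just for illustration):

```python
import re

loose = re.compile(r"^interface|description")        # only "interface" is anchored
grouped = re.compile(r"^(?:interface|description)")  # both alternatives are anchored

line = "no description"
print(bool(loose.search(line)))    # True: "description" is found mid-line
print(bool(grouped.search(line)))  # False: the line does not start with either word
```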