Extract data from a file if a pattern matches in Python

Time: 2015-03-05 11:30:25

Tags: python

I have a file containing the following data:

startTc:TC9

Client-1
IPAddress:10.203.205.111
Port:22
endTc:TC9

------------------------------------------------
startTc:TC5
Client-2
IPAddress of Client-2:10.203.205.112
Port:23
endTc:TC5
------------------------------------------------

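If the condition startTc:TC5 matches, then the data

Client-2
IPAddress of Client-2:10.203.205.112
Port:23

needs to be extracted, for example the 23 in Port.

Reading of the file should stop as soon as endTc:TC5 is seen.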

2 answers:

Answer 0 (score: 2)

One approach is to use a regular expression. In the following pattern I use positive look-around to match the string between startTc:TC5\n and \nendTc:TC5; you can then split the result on \n:

>>> import re
>>> s="""startTc:TC9
... 
... Client-1
... IPAddress:10.203.205.111
... Port:22
... endTc:TC9
... 
... ------------------------------------------------
... startTc:TC5
... Client-2
... IPAddress of Client-2:10.203.205.112
... Port:23
... endTc:TC5
... ------------------------------------------------"""
>>> re.search(r'(?<=startTc:TC5\n).*(?=\nendTc:TC5)',s,re.DOTALL).group(0).split('\n')
['Client-2', 'IPAddress of Client-2:10.203.205.112', 'Port:23']

Note that if you want to read this string from a file, you need to pass open('file_name').read() to re.search in place of s.
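The note above can be sketched as a small function. This is only an illustration under some assumptions: the helper names read_block and find_port are hypothetical, the whole file is assumed to fit in memory, and the pattern uses a non-greedy .*? (a change from the greedy .* above) so that the match stops at the first endTc marker for the given test case.

```python
import re


def read_block(path, tc):
    """Return the non-blank lines between startTc:<tc> and endTc:<tc>, or None."""
    with open(path) as f:
        text = f.read()
    # Same look-around idea as above; .*? stops at the first matching end marker.
    pattern = r'(?<=startTc:{0}\n).*?(?=\nendTc:{0})'.format(re.escape(tc))
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        return None
    return [line for line in match.group(0).split('\n') if line.strip()]


def find_port(lines):
    """Return the number from the first 'Port:<digits>' line, or None."""
    for line in lines:
        m = re.match(r'Port:(\d+)$', line)
        if m:
            return int(m.group(1))
    return None
```

With the sample file from the question, read_block(path, 'TC5') would give the same three lines as the session above, and find_port on that list would give the integer 23.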

Answer 1 (score: 0)

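def getData(infilepath, start, end):
    """Return one dict per start..end block found in the file."""
    with open(infilepath) as infile:
        data = []    # lines of the block being collected, markers included
        answer = []  # one dict per completed block
        for line in infile:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line == start or data:
                data.append(line)
            if data and line == end:
                # data[1] is e.g. "Client-2"; wrap the pair in a list so dict() accepts it
                temp = dict([data[1].split('-')])
                temp['ip'] = data[2].split(":")[1]
                temp['port'] = data[3].split(":")[1]
                answer.append(temp)
                data = []
    return answer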

Usage:

data = getData("path/to/file", "startTc:TC5", "endTc:TC5")
for d in data:
    print("Client:", d['Client'])
    print("IP:", d['ip'])
    print("Port:", d['port'])
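The question also asks for reading to stop once endTc:TC5 is seen. A minimal sketch of that variant follows, assuming every data line of interest has the form key:value. The name read_until_end is hypothetical and is not part of the answer above.

```python
def read_until_end(path, start, end):
    """Collect key:value lines after `start`, and stop reading at `end`."""
    fields = {}
    inside = False
    with open(path) as infile:
        for line in infile:
            line = line.strip()
            if not inside:
                # Ignore everything until the start marker is found.
                inside = (line == start)
                continue
            if line == end:
                break  # stop reading; the rest of the file is never consumed
            if ':' in line:
                # Split on the last colon so "IPAddress of Client-2:..." works too.
                key, _, value = line.rpartition(':')
                fields[key] = value
    return fields
```

Called as read_until_end("path/to/file", "startTc:TC5", "endTc:TC5"), this would return a dict whose 'Port' entry is the string '23'. Lines without a colon, such as Client-2, are skipped in this sketch.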