Parsing a file with regular expressions in Python

Date: 2013-04-26 08:18:55

Tags: python regex python-2.x

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# fixed the import, just read PEP 8
import re
###################################################
def extract(data):
    ms = re.match(r'(\S+).*mid:(\d+)', data)  # time (Heure) & mid
    k = re.findall(r"/(\S+)", data)  # source & destination
    exp = re.findall(r'NVS:([\w\.]+)', data)  # type S & D
    # if the list length is 1, the origin is unknown
    if len(k) == 1:
        return {'Heure': ms.group(1), 'mid': ms.group(2), "Origine": "Unknown", "Destination": k[0], "Type S": exp[0], "Type D": exp[1]}
    # if the list length is 2, there is both a source and a destination
    if len(k) == 2:
        return {'Heure': ms.group(1), 'mid': ms.group(2), "Origine": k[0], "Destination": k[1], "Type S": exp[0], "Type D": exp[1]}

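OK, the problem is that when a line contains something like [NVS:FAXG3.1.0/+44614215421], the function returns None. Is it possible to stop at the second NVS: so that the line is handled as if it looked like data2?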

data = "13:16:16.146 mta         Messages       I CC Doc O:NVS:SMTP/me@test.no R:NVS:SMTP.0/server@test.de [NVS:FAXG3.1.0/+44614215421] mid:41414"
print extract(data)

returns

>>>None

data2 = "13:16:16.146 mta         Messages       I CC Doc O:NVS:SMTP/me@test.no R:NVS:SMTP.0/server@test.de  mid:41414"
print extract(data2)

returns


>>> {'Destination': 'server@test.de', 'mid': '41414', 'Type S': 'SMTP', 'Origine': 'me@test.no', 'Type D': 'SMTP.0', 'Heure': '13:16:16.146'}

1 answer:

Answer 0 (score: 1)

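A dirty hack

Just replace

if len(k)==2:

with

if len(k)>1:

because the pattern finds more than 2 matches in the first string (data): the two addresses plus the fax number inside the brackets. With 3 matches neither branch runs, so the function falls through and returns None.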
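To make the hack concrete, here is a minimal sketch of the question's function with only that one condition changed. It first shows what the `/(\S+)` pattern captures on the problematic line, then applies the `len(k) > 1` test. The sample strings are the ones from the question; everything else is kept as in the original function, and no extra error handling is assumed.

```python
import re

# Sample lines copied from the question.
data = ("13:16:16.146 mta         Messages       I CC Doc "
        "O:NVS:SMTP/me@test.no R:NVS:SMTP.0/server@test.de "
        "[NVS:FAXG3.1.0/+44614215421] mid:41414")
data2 = ("13:16:16.146 mta         Messages       I CC Doc "
         "O:NVS:SMTP/me@test.no R:NVS:SMTP.0/server@test.de  mid:41414")

# Three captures, not two: the bracketed fax entry also contains a '/'.
matches = re.findall(r"/(\S+)", data)
# matches == ['me@test.no', 'server@test.de', '+44614215421]']


def extract(data):
    ms = re.match(r'(\S+).*mid:(\d+)', data)   # time and message id
    k = re.findall(r"/(\S+)", data)            # text after each '/'
    exp = re.findall(r'NVS:([\w.]+)', data)    # NVS type names
    if len(k) == 1:
        # A single address: the origin is unknown.
        return {'Heure': ms.group(1), 'mid': ms.group(2),
                'Origine': "Unknown", 'Destination': k[0],
                'Type S': exp[0], 'Type D': exp[1]}
    if len(k) > 1:
        # Two or more addresses: use the first two, ignore the rest.
        return {'Heure': ms.group(1), 'mid': ms.group(2),
                'Origine': k[0], 'Destination': k[1],
                'Type S': exp[0], 'Type D': exp[1]}


print(extract(data) == extract(data2))
```

Because only `k[0]`, `k[1]`, `exp[0]` and `exp[1]` are read, any matches after the second one are simply ignored, which has the same effect as stopping at the second NVS: entry. The sketch uses `print(...)` with a single argument so that it runs under both Python 2 and Python 3.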