Find all lines matching a regular expression and extract part of the string

Time: 2013-08-30 15:39:19

Tags: python regex

f = open("machinelist.txt", 'r')
lines = f.readlines()
for host in lines:
    hostnames = host.strip()
    print hostnames

Returns:

\\TESTHOSTDEV01
\\TESTHOSTDEVDB01
\\TESTHOSTDEVDBQA
\\TESTHOSTDEVQA02
\\BTLCMOODY01          MRA Server
\\BTLCSTG05            StG Server
\\BTLCWEB02
\\BTLCWSUS01           Test Update Server
\\HIMSAPP01
\\SLVAPP01
\\TORAAPP01
\\HNSVAPP01
\\TESAPP01

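I'm curious whether there is a way to use re.findall() to get all lines that start with "\\", capturing only the hostname (for example: BTLCMOODY01) and not the leading "\\" or the comment after the host, such as "MRA Server".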

3 Answers:

Answer 0 (score: 3)

You can do something like this (no regex needed):

Use str.startswith to check whether a line starts with '\\':

>>> strs = "\\BTLCMOODY01          MRA Server\n"
>>> strs.startswith('\\')
True

Then use a combination of str.split and str.lstrip to get the first word:

>>> strs.split(None, 1)
['\\BTLCMOODY01', 'MRA Server\n']  
#apply str.lstrip on the first item
>>> strs.split(None, 1)[0].lstrip('\\')
'BTLCMOODY01'

Code:

>>> with open('abc1') as f:
...     for line in f:
...         if line.startswith('\\'):     #check if the line startswith `\`
...             print line.split(None,1)[0].lstrip('\\') 
...             
TESTHOSTDEV01
TESTHOSTDEVDB01
TESTHOSTDEVDBQA
TESTHOSTDEVQA02
BTLCMOODY01
BTLCSTG05
BTLCWEB02
BTLCWSUS01
HIMSAPP01
SLVAPP01
TORAAPP01
HNSVAPP01
TESAPP01
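
The session above uses the Python 2 print statement. For Python 3, the same startswith/split/lstrip steps could be wrapped in a small helper, roughly as sketched below. The function name extract_hostnames and the inline sample lines are assumptions made for illustration; they are not part of the original answer.

```python
def extract_hostnames(lines):
    """Return the first word of each line that starts with a backslash,
    with the leading backslashes removed."""
    hosts = []
    for line in lines:
        if line.startswith('\\'):
            # split(None, 1) splits once, on the first run of whitespace
            hosts.append(line.split(None, 1)[0].lstrip('\\'))
    return hosts


# Hypothetical sample lines mirroring the question's file
sample = [
    "\\\\BTLCMOODY01          MRA Server\n",
    "\\\\BTLCWEB02\n",
    "some other line\n",
]
print(extract_hostnames(sample))
```

With a real file, the open file object can be passed directly, since iterating over it yields lines.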

Answer 1 (score: 2)

An approach using a regular expression:

import re

with open("machinelist.txt", 'r') as f:
    for host in f:
        hostname = host.strip()
        # Match the two leading backslashes and capture the run of
        # non-whitespace characters that follows (the hostname)
        match = re.match(r'\\\\(\S+)', hostname)
        if match:
            print(match.group(1))

This produces:

TESTHOSTDEV01
TESTHOSTDEVDB01
TESTHOSTDEVDBQA
TESTHOSTDEVQA02
BTLCMOODY01
BTLCSTG05
BTLCWEB02
BTLCWSUS01
HIMSAPP01
SLVAPP01
TORAAPP01
HNSVAPP01
TESAPP01
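
Since the question asks about re.findall() specifically, the same pattern can probably be applied to the whole file contents at once with the re.MULTILINE flag. The following is a minimal sketch, assuming the text has been read into a single string; the sample text here is a hypothetical stand-in for machinelist.txt.

```python
import re

# Hypothetical sample text standing in for the contents of machinelist.txt
text = (
    "\\\\TESTHOSTDEV01\n"
    "\\\\BTLCMOODY01          MRA Server\n"
    "not a host line\n"
    "\\\\BTLCWSUS01           Test Update Server\n"
)

# Under re.MULTILINE, ^ anchors at the start of every line; \\\\ matches the
# two literal backslashes; (\S+) captures the hostname up to the first whitespace.
hostnames = re.findall(r'^\\\\(\S+)', text, flags=re.MULTILINE)
print(hostnames)
```

Because the pattern has exactly one capturing group, re.findall() returns a list of the captured hostnames rather than the full matches. With a real file, text would come from f.read().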

Answer 2 (score: 0)

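import re

# Two leading backslashes followed by the hostname: \\HOSTNAME
# \S+ captures everything up to the first whitespace, including the digits
# in names such as BTLCMOODY01, which [a-z]+ would not match.
pattern = re.compile(r"^\\\\(\S+)")

with open("file.txt", "r") as fh:
    for x in fh:
        match = pattern.match(x)
        if match:
            print(match.group(1))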