Python regular expression on a file

Time: 2017-11-14 10:44:28

Tags: python regex

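I want to extract the lines of a file that follow a certain sequence. For example, a file contains many lines, and I only want the ones that match the sequence: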

journey (a,b) from station south chennai to station punjab chandigarh
journey (c,d) from station jammu katra to city punjab chandigarh
journey (e) from station 

Let's say the above is the code, and I want to extract the following information from the first two lines:

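For example, this is the sequence: the first word is journey, then the parentheses contain two words, then from, then either the word station or city, then any string, then to, then again either the word station or city, followed by any string.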

What would the regular expression for that be? Note: the words in the parentheses may contain special characters such as - and _.

1 answer:

Answer 0: (score: 0)

This will return the elements you want:

import re

s = '''journey (a,b) from station south chennai to station punjab chandigarh
journey (c,d) from station jammu katra to city punjab chandigarh
journey (e) from station
journey (c,d) from station ANYSTRING jammu katra to ANYSTRING city punjab chandigarh
'''

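# Capture the bracketed pair, the origin and the destination as separate groups.
# A raw string keeps the backslashes intact; \s? is equivalent to \s{0,1}.
matches_single = re.findall(r'journey (\([^,]+,[^,]+\)) from (\S+ \S+\s?\S*) to (\S+ \S+\s?\S*)', s)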
for match in matches_single:
    print(match)
# Same pattern wrapped in one group, so each match is the whole matching text.
matches_line = re.findall(r'(journey \([^,]+,[^,]+\) from \S+ \S+\s?\S* to \S+ \S+\s?\S*)', s)
for match in matches_line:
    print(match)
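
The pattern above accepts any word where the question asks for station or city, and it limits each place name to at most two words. If you want to enforce the sequence from the question more strictly, something like the following might work. It is only a sketch: the names JOURNEY_RE and parse_journeys and the group names are made up for illustration, and it assumes one journey per line and that the words in the parentheses consist of letters, digits, - and _.

```python
import re

# Hypothetical stricter pattern (not from the answer above): requires exactly
# two words in the parentheses and the literal word "station" or "city" after
# both "from" and "to". Assumes one journey per line.
JOURNEY_RE = re.compile(
    r"""^journey\s+
        \(\s*(?P<first>[\w-]+)\s*,\s*(?P<second>[\w-]+)\s*\)  # two words in parentheses
        \s+from\s+(?P<from_kind>station|city)\s+(?P<from_name>.+?)
        \s+to\s+(?P<to_kind>station|city)\s+(?P<to_name>.+?)\s*$""",
    re.VERBOSE,
)


def parse_journeys(lines):
    """Yield a dict of the named groups for every line that matches."""
    for line in lines:
        m = JOURNEY_RE.match(line.strip())
        if m:
            yield m.groupdict()


sample = [
    "journey (a,b) from station south chennai to station punjab chandigarh",
    "journey (c,d) from station jammu katra to city punjab chandigarh",
    "journey (e) from station",
]

for journey in parse_journeys(sample):
    print(journey)
```

Since parse_journeys only iterates over its argument, an open file object can be passed in place of the list, which reads the file line by line instead of loading it all into one string.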