Regex: print only the last match in a group

Asked: 2011-11-16 20:39:44

Tags: python regex parsing

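I wrote a script that searches each file for a particular string and prints the match. This worked well until the number of matches per file became indeterminate, and I only need to print the last one. I'm wondering about found2.group and found3.group: is there a way to print only the last result?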

import glob
import os
import re

dir = os.getcwd()
f = open("CompTime.csv", "w")
for infile in glob.glob(os.path.join(dir, '*.out')):
    file_handler = open(infile, "r")
    content = file_handler.read()
    file_handler.close()
    # Find Real Time (re.search returns only the first match)
    found2 = re.search(' REAL TIME  *.+', content)
    rtime = found2.group(0)[1:-1]
    # Find CPU Time
    found3 = re.search(' CPU TIMES  *.+', content)
    ctime = found3.group(0)[1:-1]
    # Create and format the results.
    tResult = str(rtime) + ',' + str(ctime)
    f.seek(0, 2)
    f.write(tResult + '\n')
f.close()

Is there a way to do this? I have been reading the documentation on regular expressions, but I can't seem to get it working.

Working version:

import glob
import os
import re

dir = os.getcwd()
for infile in glob.glob(os.path.join(dir, '*.out')):
    file_handler = open(infile, "r")
    content = file_handler.read()
    file_handler.close()
    # Find Real Time: findall returns every match, [-1] takes the last one
    rtime = re.findall(' REAL TIME  *.+', content)[-1]
    # Find CPU Time
    ctime = re.findall(' CPU TIMES  *.+', content)[-1]
    # Create and format the results.
    tResult = str(rtime) + ',' + str(ctime)
    print(tResult)
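
One caveat with indexing the result of re.findall using [-1]: if a file contains no match at all, the list is empty and the lookup raises IndexError. Below is a minimal sketch of a guard, not part of the original script. The helper name last_match, its default parameter, and the sample content are all hypothetical and chosen only for illustration.

```python
import re


def last_match(pattern, text, default=""):
    """Return the last non-overlapping match of pattern in text.

    Hypothetical helper: falls back to `default` when nothing matches,
    instead of raising IndexError on an empty list.
    """
    matches = re.findall(pattern, text)
    return matches[-1] if matches else default


# Made-up sample content standing in for the contents of one .out file.
content = " REAL TIME  1.0\n REAL TIME  2.5\n"

print(last_match(' REAL TIME  *.+', content))
print(last_match(' CPU TIMES  *.+', content, default="N/A"))
```

Whether a missing value should become an empty field, a placeholder such as "N/A", or an error depends on how the CSV is consumed afterwards, so the default here is only one possible choice.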

1 Answer:

Answer 0: (score: 2)

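I believe re.search only finds the first occurrence (and re.match is stricter still, matching only at the start of the string); you can use re.findall(pattern, string) instead: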

>>> re.findall('-[a-zA-Z]', 'ls -A -H -B -b .')
['-A', '-H', '-B', '-b']

Then you can index the result like any other list:

>>> re.findall('-[a-zA-Z]', 'ls -A -H -B -b .')[-1]
'-b'
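
Since the question mentions groups, one detail may be worth keeping in mind: when the pattern contains a capture group, re.findall returns the group's text rather than the whole match. If the match object itself is needed (for group() or start()), re.finditer can be walked to its last item. The snippet below is only a sketch of that behaviour, reusing the example string above; the variable names are made up for illustration.

```python
import re

text = "ls -A -H -B -b ."

# No capture group: findall returns the full matches.
full = re.findall(r'-[a-zA-Z]', text)

# One capture group: findall returns only the captured text.
letters = re.findall(r'-([a-zA-Z])', text)

# finditer yields match objects; after the loop, `last` holds the final one
# (or stays None if there was no match at all).
last = None
for last in re.finditer(r'-([a-zA-Z])', text):
    pass

print(full[-1], letters[-1])
if last is not None:
    print(last.group(0), last.group(1), last.start())
```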