Question

The text has no spaces, so I can't simply split it and index into the resulting list of strings.

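The pattern I'm looking for is:

check=

It is followed by a number and an encoded query-string item (this is an Apache log file), and it appears twice on every line of the file. I want the output to give me whatever comes after

check=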

For example, one line of the file looks like this:

11.249.222.103 - - [15/Aug/2016:13:17:56 -0600] "GET /next2/120005079807?check=37593467%2CCOB&check=37593378%2CUAP&box=match&submit=Next HTTP/1.1" 500 1633 "https://mvt.squaretwofinancial.com/newmed/?button=All&submit=Submit" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"

In this case, I need to extract 37593467 and 37593378.

Answer 1

Please try this code.

import re

text = '''11.249.222.103 - - [15/Aug/2016:13:17:56 -0600] "GET /next2/120005079807?check=37593467%2CCOB&check=37593378%2CUAP&box=match&submit=Next HTTP/1.1" 500 1633 "https://mvt.squaretwofinancial.com/newmed/?button=All&submit=Submit" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"'''


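# Raw string so \d reaches the regex engine unchanged; the group captures the digits after "check=".
for match in re.findall(r"check=(\d+)", text):
    print('Found "%s"' % match)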

Output:

C:\Users\dinesh_pundkar\Desktop>python demo.py
Found "37593467"
Found "37593378"
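Since the question mentions that the pattern occurs on every line of the file, the same regex can be wrapped in a small helper and applied line by line. This is only a sketch: the names `CHECK_ID`, `extract_check_ids`, and `check_ids_per_line` are made up for illustration, and the sample line is shortened from the one in the question.

```python
import re

# Compile once; the raw string keeps \d from being read as a string escape.
CHECK_ID = re.compile(r"check=(\d+)")


def extract_check_ids(line):
    """Return every run of digits that directly follows 'check=' in one line."""
    return CHECK_ID.findall(line)


def check_ids_per_line(lines):
    """Yield one list of IDs per input line (an open file object also works)."""
    for line in lines:
        yield extract_check_ids(line)


# Shortened version of the log line from the question.
sample = (
    '11.249.222.103 - - [15/Aug/2016:13:17:56 -0600] '
    '"GET /next2/120005079807?check=37593467%2CCOB&check=37593378%2CUAP'
    '&box=match&submit=Next HTTP/1.1" 500 1633'
)

print(extract_check_ids(sample))
```

To run it over a real log, pass an open file to `check_ids_per_line`, for example inside `with open(path) as f:` where `path` points at your own log file.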

Some links that may help:

Python regex to get the substring after a preceding substring match
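If the full decoded value is also of interest (for example `37593467,COB` rather than just the number), one alternative to a pure regex is to pull the request path out of the log line and let the standard library parse the query string. This is a sketch under the assumption that the request is logged as `"METHOD path HTTP/x.y"`, as in the sample line; `REQUEST`, `check_values`, and `check_ids` are illustrative names, not part of the original answer.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed request format: "GET /path?query HTTP/1.1" (other methods would need adding).
REQUEST = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def check_values(line):
    """Return the decoded values of every 'check' parameter in the request."""
    m = REQUEST.search(line)
    if m is None:
        return []
    query = urlsplit(m.group(1)).query
    # parse_qs percent-decodes values, so %2C becomes a comma.
    return parse_qs(query).get("check", [])


def check_ids(line):
    """Keep only the part of each value before the first comma."""
    return [value.split(",", 1)[0] for value in check_values(line)]


# Shortened version of the log line from the question.
log_line = (
    '11.249.222.103 - - [15/Aug/2016:13:17:56 -0600] '
    '"GET /next2/120005079807?check=37593467%2CCOB&check=37593378%2CUAP'
    '&box=match&submit=Next HTTP/1.1" 500 1633'
)

print(check_values(log_line))
print(check_ids(log_line))
```

The regex in the answer is shorter and is enough when only the digits are needed; parsing the query string is mainly useful when the parameter has to match by exact name or the decoded suffix matters.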
