RegEx doesn't seem to match, even though it should

Time: 2017-06-14 10:18:48

Tags: python regex python-requests

I'm having trouble matching the string W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl against my RegEx ^([A-Za-z0-9]{32})$.

According to various online RegEx tools it should match, but according to my Python script it does not:

pattern = re.compile("^([A-Za-z0-9]{32})$")
print(line)
if pattern.match(line):
    return line
else:
    return None

I tried using strip() to check for any invisible whitespace, but couldn't find anything.

Here is the whole script:

import requests, binascii, base64, re
from requests.auth import HTTPBasicAuth

def pattern_lookup(line):
    """
    Checks whether a line is a string
    that is 32 characters long and only
    holds alphanumerical characters.
    -----
    :param line: The line to be checked.
    :return: The line holding the matched string,
             or None if not found
    """
    pattern = re.compile("^([A-Za-z0-9]{32})$")
    print(line)
    if pattern.match(line):
        return line
    else:
        return None

def get_secret(host, credentials):
    """
    Grabs the hint(flag) from the 
    host by splitting the response on
    semicolon (;) then performing
    pattern matching using regex.
    ----
    :param host: The host we are sending 
                 requests to.
    :param credentials: The credentials required
                        to sign into the host.
    :return: The hex encoded secret.
    """
    try:
        response = requests.get(host, auth=HTTPBasicAuth(*credentials))
        response_lines = response.content.decode('ascii').replace('"', '').split(';')
        return next((line
                 for line in response_lines
                 if pattern_lookup(line)),
                None)
    except requests.RequestException as e:
        print(e)

def prepare_payload(secret):
    decoded_secret = base64.b64decode(binascii.unhexlify(secret)[::-1])
    payload = {'secret': decoded_secret, 'submit': 'placeholder'}
    return payload

def get_password(host, credentials, secret):
    """
    Uses a post-request injected with the
    reverse engineered secret to get access
    to the password to natas9.
    :param host: The host that holds the
                 password.
    :param credentials: The credentials required
                        to sign into the host.
    :param secret: The hex encoded secret.
    :return: The password to Natas9
    """
    payload = prepare_payload(secret)
    try:
        response = requests.post(host, auth=HTTPBasicAuth(*credentials), data=payload)
        response_lines = response.content.decode('utf-8').split(' ')
        return next((line
                     for line in response_lines
                     if pattern_lookup(line.strip())),
                    None)
    except requests.RequestException as e:
        print(e)


def main():
    host = 'http://natas8.natas.labs.overthewire.org/index-source.html'
    credentials = ['natas8', 'DBfUBfqQG69KvJvJ1iAbMoIpwSNQ9bWe']
    secret = get_secret(host, credentials)
    print(get_password(host.split('index')[0], credentials, secret))

if __name__ == '__main__':
    main()

EDIT:

I should mention that the initial test in get_secret works flawlessly, and all the earlier modules where I used this approach worked fine...

EDIT2:

Output:

<link
rel="stylesheet"
type="text/css"
href="http://natas.labs.overthewire.org/css/level.css">
<link
rel="stylesheet"
href="http://natas.labs.overthewire.org/css/jquery-ui.css"
/>
<link
rel="stylesheet"
href="http://natas.labs.overthewire.org/css/wechall.css"
/>
<script
src="http://natas.labs.overthewire.org/js/jquery-1.9.1.js"></script>
<script
src="http://natas.labs.overthewire.org/js/jquery-ui.js"></script>
<script
src=http://natas.labs.overthewire.org/js/wechall-data.js></script><script
src="http://natas.labs.overthewire.org/js/wechall.js"></script>
<script>var
wechallinfo
=
{
"level":
"natas8",
"pass":
"DBfUBfqQG69KvJvJ1iAbMoIpwSNQ9bWe"
};</script></head>
<body>
<h1>natas8</h1>
<div
id="content">

Access
granted.
The
password
for
natas9
is
W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl <-- here it is
<form
method=post>
Input
secret:
<input
name=secret><br>
<input
type=submit
name=submit>
</form>

<div
id="viewsource"><a
href="index-source.html">View
sourcecode</a></div>
</div>
</body>
</html>
None

2 answers:

Answer 0 (score: 1)

I made a demo based on your regex, and it works fine.

import re
line = 'W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl'
pattern = re.compile("^([A-Za-z0-9]{32})$")
print(line)
if pattern.match(line):
    print ("matched")
else:
    print ("No")


This means the line you read from response_lines is not in the format the regex expects. Try printing that line and see what is off.
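One way to see what is really in the string is to print its `repr()` instead of the string itself. The sketch below assumes the token came out of `split(' ')` with a newline still inside it; the sample value is made up to mirror the output in the question, not taken from the real response:

```python
import re

pattern = re.compile("^([A-Za-z0-9]{32})$")

# Hypothetical token: splitting on ' ' alone leaves the newline and the next tag attached.
token = "W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl\n<form"

print(token)        # looks like two separate lines on screen
print(repr(token))  # the escaped form shows the embedded \n
print(len(token))   # 38, not 32

stripped = token.strip()  # strip() only trims the ends of the string
matched = pattern.match(stripped) is not None
print(matched)  # False
```

If that is what is happening, it would also explain why `strip()` found nothing: the newline sits in the middle of the token, not at either end.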

Edit: after your edit, I see that the data spans multiple lines. Use the following:

pattern = re.compile("^([A-Za-z0-9]{32})$", re.MULTILINE)
if pattern.search(line):
    print ("matched")
else:
    print ("No")
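
To illustrate what the flag changes, here is a small sketch on a made-up token with embedded newlines (an assumption about what the split produces, not the real response). With `re.MULTILINE`, `^` and `$` also match at line boundaries, and `search` scans the whole string instead of only its start, so the captured group holds just the 32 characters:

```python
import re

plain = re.compile(r"^([A-Za-z0-9]{32})$")
multi = re.compile(r"^([A-Za-z0-9]{32})$", re.MULTILINE)

# Hypothetical token with a newline on each side of the password.
token = "is\nW0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl\n<form"

# Without the flag, ^ and $ anchor to the whole string only.
print(plain.search(token))  # None

# With the flag, the middle line can match on its own.
found = multi.search(token)
print(found.group(1) if found else None)
```

Returning `found.group(1)` rather than the whole token keeps the surrounding markup out of the result.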


Answer 1 (score: 0)

Your text is multi-line. Have you tried:

 re.compile("^([A-Za-z0-9]{32})$", re.MULTILINE)
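
An alternative that avoids the flag altogether, offered as a sketch and not as what the script in the question does: calling `str.split()` with no argument splits on any run of whitespace, newlines included, so each token is a single word that can be checked with `fullmatch`. The helper name `find_token` and the sample text are illustrative assumptions:

```python
import re

TOKEN = re.compile(r"[A-Za-z0-9]{32}")

def find_token(text):
    """Return the first whitespace-delimited word of exactly 32 alphanumerics, or None."""
    # split() with no argument breaks on spaces, tabs and newlines alike.
    return next((word for word in text.split() if TOKEN.fullmatch(word)), None)

# Made-up sample shaped like the response in the question.
sample = "The password for natas9 is W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl\n<form method=post>"
print(find_token(sample))
```

Because `fullmatch` requires the whole word to match, the `^` and `$` anchors are not needed here.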