Question

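I am trying to scrape a website with the Python requests module, and I need to log in to the site to retrieve the data I want. I wrote a script that I think should work, but it doesn't. I'm not sure whether this is because of the authentication method the site uses for login. This is the site I'm trying to scrape: https://ashwoodvic.compass.education/login.aspx?sessionstate=disabled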

import requests
import bs4 as bs

login_url = "https://ashwoodvic.compass.education/login.aspx?sessionstate=disabled"
target_url = "https://ashwoodvic.compass.education"

login_data = { "username": "my_username", "password": "my_password"}

with requests.Session() as s:
    page = s.get(login_url)
    page_login = s.post(login_url, data = login_data)
    page = s.get(target_url)
    final_page = bs.BeautifulSoup(page.content, 'lxml')
    print(final_page.title)

Here is the HTML for the username and password fields:

<input name="username" type="text" id="username" class="metro-input" placeholder="Username" value="">
<span id="username-error" class=""></span>
<label class="ie789Only"> Password</label>
<input name="password" type="password" id="password" class="metro-input" placeholder="Password">
<input type="submit" name="button1" value="Sign in" id="button1" class="metro-button">

The script runs without raising any errors, but it simply fails to log in, and the title printed at the end is Login. Any help is appreciated.
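
One thing that may be worth checking, offered as an assumption and not a confirmed diagnosis: a browser submits every input in the login form, including the submit button (button1 in the HTML above) and any hidden inputs the page contains, while the script sends only username and password. The sketch below collects all named inputs from a page into a POST payload using only the standard library. The names FormFieldCollector and build_login_payload are hypothetical helpers, and the __VIEWSTATE input in the sample HTML is invented for illustration; the real page may or may not contain such a field.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs from every named <input> element in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name:
            # Inputs with a missing or empty value are stored as empty strings.
            self.fields[name] = attr.get("value") or ""


def build_login_payload(html, username, password):
    """Return all form fields found in html, with the credentials filled in."""
    collector = FormFieldCollector()
    collector.feed(html)
    payload = dict(collector.fields)
    payload["username"] = username
    payload["password"] = password
    return payload


# Hypothetical login page: the __VIEWSTATE input and its value are invented
# for illustration; the other three inputs are the ones quoted in the question.
SAMPLE_HTML = """
<form method="post">
<input type="hidden" name="__VIEWSTATE" value="abc123">
<input name="username" type="text" id="username" class="metro-input" placeholder="Username" value="">
<input name="password" type="password" id="password" class="metro-input" placeholder="Password">
<input type="submit" name="button1" value="Sign in" id="button1" class="metro-button">
</form>
"""

payload = build_login_payload(SAMPLE_HTML, "my_username", "my_password")
print(payload)
```

In the script above, the payload would be built from the text of the first s.get(login_url) response and passed as data= to s.post(login_url, ...). Whether that is enough to log in to this particular site is untested.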

Web Scraping with Python requests (login)

0 answers: