I want to scrape the web page that is returned after selecting an option from a drop-down list, shown below:
<h1>Scraping Test</h1>
<form action="/tests/scraping" method='post'>
<input type="hidden" name="csrf_token" value="1499585369##d2d1570f820aec0589b3bd5f4ab4e7df913e25ff"/>
<table>
<tr>
<td>Select Ward: </td>
<td>
<select name="ward">
<option value=''>-- select --</option>
<option value='DHANLAXMICOMPLEX'>DHANLAXMICOMPLEX</option>
<option value='POTALIYA'>POTALIYA</option>
<option value='ARJUNTOWER'>ARJUNTOWER</option>
<option value='NEWCLOTHMARKET'>NEWCLOTHMARKET</option>
<option value='CHANAKYAPURI'>CHANAKYAPURI</option>
<option value='BHAIKAKANAGAR'>BHAIKAKANAGAR</option>
<option value='RADHASWAMYROAD'>RADHASWAMYROAD</option>
<option value='SATADHAR'>SATADHAR</option>
<option value='AMRUTAVIDYALAYA'>AMRUTAVIDYALAYA</option>
<option value='AGARWALTOWERS'>AGARWALTOWERS</option>
<option value='RANNAPARK'>RANNAPARK</option>
<option value='IIM'>IIM</option>
<option value='VEJALPURWARD'>VEJALPURWARD</option>
<option value='GITAMANDIR'>GITAMANDIR</option>
</select>
</td>
<td><input type="submit" value="Search" class="search"/></td>
</tr>
</table>
How do I request the page for an option in this drop-down menu? There is also a Search button. My code:
import requests, csv
from lxml import html

def get_all_pages():
    payload = {'value':'DHANLAXMICOMPLEX'}
    url = requests.get('https://recruitment.advarisk.com/tests/scraping',data=payload)
    print(url.text)
Answer 0 (score: 1)
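You can read the CSRF token from this hidden HTML element:
<input type="hidden" name="csrf_token" value="1499585369##d2d1570f820aec0589b3bd5f4ab4e7df913e25ff"/>
and send it with your request. Note that the form submits with method='post' and the drop-down field is named ward, so the request should be a POST whose data includes ward, not a GET with a value key. Try the following code, and let me know if you run into any problems: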
import lxml.html
import requests

url = "https://recruitment.advarisk.com/tests/scraping"

# Use a session so cookies set by the GET are sent again with the POST.
s = requests.Session()
r = s.get(url)

# Read the CSRF token from the hidden input on the form page.
source = lxml.html.document_fromstring(r.content)
token = source.xpath('//input[@name="csrf_token"]/@value')[0]

# Submit the form the way the browser would: token plus the selected ward.
headers = {'Referer': url}
data = {'csrf_token': token, 'ward': 'DHANLAXMICOMPLEX'}
print(s.post(url, data=data, headers=headers).text)
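If you need the result page for every ward rather than just one, the same idea can be extended along these lines. This is only a sketch under a few assumptions: the helper names parse_form and fetch_all_wards are made up for illustration, the parsing uses the standard-library html.parser instead of lxml so that step has no third-party dependency, and I am assuming the site either keeps the token valid across submissions or returns a fresh one in each response (the code picks up a new token when it finds one).

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collects the csrf_token value and the non-empty ward option values."""

    def __init__(self):
        super().__init__()
        self.token = None
        self.wards = []
        self._in_ward_select = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name") == "csrf_token":
            self.token = attrs.get("value")
        elif tag == "select" and attrs.get("name") == "ward":
            self._in_ward_select = True
        elif tag == "option" and self._in_ward_select and attrs.get("value"):
            # Skips the '-- select --' placeholder, whose value is empty.
            self.wards.append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._in_ward_select = False


def parse_form(page_html):
    """Return (csrf_token, [ward values]) found in the form HTML."""
    parser = FormFieldParser()
    parser.feed(page_html)
    return parser.token, parser.wards


def fetch_all_wards(url):
    """POST the form once per ward and return {ward: response HTML}."""
    import requests  # third-party; imported here so parse_form works without it

    results = {}
    with requests.Session() as s:
        token, wards = parse_form(s.get(url).text)
        for ward in wards:
            data = {"csrf_token": token, "ward": ward}
            r = s.post(url, data=data, headers={"Referer": url})
            results[ward] = r.text
            # Assumption: if the response carries a new token, use it next time.
            new_token, _ = parse_form(r.text)
            token = new_token or token
    return results
```

Calling fetch_all_wards("https://recruitment.advarisk.com/tests/scraping") would then give you one HTML page per ward, which you can parse for the actual results. If the server rejects later requests, fetching the form again before each POST is the first thing I would try.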