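In short, I want to download search results from this website, but I run into a popup that asks for input. How can I submit that form with Python 3 (without Mechanize) so that I can reach the search results?
...the form looks like this: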
<div id="forcedVehicleQuestions" class="forcedUserInput" style="display: block; position: absolute; left: 50%; top: 40px; z-index: 6000; margin-left: -199px; margin-top: 0px;">
<div class="forcedContents clearfix">
<a class="btn-remove" onclick="closeForced('Search','question');">
<svg><use xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="#shape-remove"></use></svg>
</a>
<form name="forcedQuestionsForm" id="forcedQuestionsForm">
<h2 class="sans">
More Product Info Required
</h2>
<p id="questionText" class="questionText">
Brake Pads - Position
</p>
<div id="forceQuestionsRadio">
<div class="form-row">
<label class="questionRadio checkbox-radio" id="questionRadio" for="Front">
<input type="radio" id="Front" name="answer" value="10219">
Front
</label>
</div>
<div class="form-row">
<label class="questionRadio checkbox-radio" id="questionRadio" for="Rear">
<input type="radio" id="Rear" name="answer" value="10290">
Rear
</label>
</div>
<div class="form-row">
<label class="questionRadio checkbox-radio" id="questionRadio" for="Show all">
<input type="radio" id="Show all" checked="" name="answer" value="-1">
Show all
</label>
</div>
</div>
<input id="questionSubmit" type="button" class="btn btn-green btn-shadow" value="Continue" onclick="setQuestionAnswer('Brake Pads - Position',document.forms['forcedQuestionsForm'].elements['answer'],'Show all');">
<div id="forcedVehicleQuestionsLoading" class="loading load-sm">
<div class="spinner"></div>
</div>
</form>
</div>
</div>
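...but the form has no method= attribute, so POST and GET don't seem to apply. I'm fairly sure I understand how to set the radio buttons, so what I really want to know is how to make the onclick= behavior happen.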
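For context, requests does not execute JavaScript, so the onclick= handler cannot literally be clicked; the usual workaround is to reproduce whatever HTTP request the handler sends. What setQuestionAnswer actually sends is not shown here, so the sketch below only covers the first step: reading the radio values the handler would have to work with. It uses the standard-library html.parser on a trimmed copy of the form above; `RadioCollector` and `radio_options` are illustrative names, not anything the site provides.

```python
from html.parser import HTMLParser

# Trimmed copy of the form from the question (only the parts the parser reads).
FORM_HTML = """
<form name="forcedQuestionsForm" id="forcedQuestionsForm">
<input type="radio" id="Front" name="answer" value="10219">
<input type="radio" id="Rear" name="answer" value="10290">
<input type="radio" id="Show all" checked="" name="answer" value="-1">
<input id="questionSubmit" type="button" value="Continue" onclick="setQuestionAnswer('Brake Pads - Position',document.forms['forcedQuestionsForm'].elements['answer'],'Show all');">
</form>
"""


class RadioCollector(HTMLParser):
    """Collect radio inputs and remember which one is pre-checked."""

    def __init__(self):
        super().__init__()
        self.options = {}    # maps the input id (label) to its value
        self.checked = None  # id of the pre-checked radio button, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("type") == "radio":
            self.options[attrs["id"]] = attrs["value"]
            if "checked" in attrs:
                self.checked = attrs["id"]


def radio_options(html):
    """Return (options, checked) for the radio buttons found in html."""
    parser = RadioCollector()
    parser.feed(html)
    return parser.options, parser.checked


options, checked = radio_options(FORM_HTML)
print(options)
print(checked)
```

The value of the chosen option (for example -1 for "Show all") is presumably what ends up in the answer= query parameter, but that is an assumption until the request is confirmed in the browser's network tab.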
So far, this is what I'm using:
# Import the HTTP library
import requests
# Import the HTML parsing library
import bs4

url = 'http://www.oreillyauto.com/site/c/search/Brake+Pads+&+Shoes/C0068/C0009.oap?model=G6&vi=1432754&year=2006&make=Pontiac'
answerURL = 'http://www.oreillyauto.com/site/ConditionSelectServlet?answer=-1'
# Placeholder log name: `output` was not defined in the snippet as posted
output = 'oreilly_search'

print("Making request")
session = requests.Session()
# Present the search page as the referer of the answer request
session.headers.update({'referer': url})
r = session.get(answerURL)
print(r.status_code)
oreillyList = bs4.BeautifulSoup(r.text, "lxml")

print("Writing response...")
logfile = 'C:/Users/mhurley/Portable_Python/notebooks/' + output + '.log'
with open(logfile, 'w', encoding='utf-8') as file:
    file.write(oreillyList.prettify())
print("...done writing " + logfile)
This code retrieves the page's HTML, but it does not contain any search results.
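One thing that may be worth trying, offered as a guess rather than a diagnosis: the code above never requests the search page itself, so the session has no cookies from it when the answer is sent, and the response being saved is the servlet's reply rather than the search page. The sketch below assumes the site remembers the answer server-side per session and that ConditionSelectServlet is the endpoint the onclick handler calls; neither is confirmed. `fetch_with_answer` is a hypothetical helper that accepts any object with a requests-style `get` method.

```python
def fetch_with_answer(session, search_url, answer_url):
    """Load the search page, submit the answer, then reload the search page."""
    # 1. Visit the search page first so the server can set its session cookies.
    session.get(search_url)
    # 2. Send the answer in the same session, with the search page as referer.
    session.get(answer_url, headers={"Referer": search_url})
    # 3. Request the search page again. If the answer is stored server-side
    #    (an assumption), this response may now hold the results.
    return session.get(search_url)


# Usage with requests (needs network access, so it is left commented out):
# import requests
# with requests.Session() as session:
#     r = fetch_with_answer(session, url, answerURL)
#     print(r.status_code)
```

If the results still do not appear, the browser's network tab should show which request setQuestionAnswer really makes, and that request is the one to mirror.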