I'm trying to fill out the form on www.wetseal.com/Stores, which lets you select the state whose stores should be shown.
<form action="http://www.wetseal.com/Stores?dwcont=C73689620" method="post" id="dwfrm_storelocator_state">
<fieldset>
<div class="form-row required ">
<label for="dwfrm_storelocator_address_states_stateUSCA">
<span>State</span>
<span class="required-indicator">*</span>
</label>
<select id="dwfrm_storelocator_address_states_stateUSCA" class="input-select required" name="dwfrm_storelocator_address_states_stateUSCA">
<option value="">Select...</option>
<option value="AK">Alaska</option>
<option value="AZ">Arizona</option>
<option value="AR">Arkansas</option>
<option value="CA">California</option>
<option value="CO">Colorado</option>
<option value="CT">Connecticut</option>
<option value="DE">Delaware</option>
<option value="FL">Florida</option>
<option value="GA">Georgia</option>
<option value="HI">Hawaii</option>
<option value="ID">Idaho</option>
<option value="IL">Illinois</option>
<option value="IN">Indiana</option>
<option value="KS">Kansas</option>
<option value="KY">Kentucky</option>
<option value="MD">Maryland</option>
<option value="MA">Massachusetts</option>
<option value="MI">Michigan</option>
<option value="MN">Minnesota</option>
<option value="MS">Mississippi</option>
<option value="MO">Missouri</option>
<option value="NE">Nebraska</option>
<option value="NV">Nevada</option>
<option value="NH">New Hampshire</option>
<option value="NJ">New Jersey</option>
<option value="NM">New Mexico</option>
<option value="NY">New York</option>
<option value="NC">North Carolina</option>
<option value="ND">North Dakota</option>
<option value="OH">Ohio</option>
<option value="OK">Oklahoma</option>
<option value="OR">Oregon</option>
<option value="PA">Pennsylvania</option>
<option value="PR">Puerto Rico</option>
<option value="RI">Rhode Island</option>
<option value="SC">South Carolina</option>
<option value="SD">South Dakota</option>
<option value="TN">Tennessee</option>
<option value="TX">Texas</option>
<option value="VA">Virginia</option>
<option value="WA">Washington</option>
<option value="WV">West Virginia</option>
<option value="WI">Wisconsin</option>
</select>
</div>
<button type="submit" name="dwfrm_storelocator_findbystate" value="Search">
Search
</button>
</fieldset>
</form>
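Using Chrome, I can see the request being made along with its form parameters.
That said, I have a very simple spider that, following the docs, sends a FormRequest to that URL to fill in the form (here I'm testing for stores in Arizona, AZ):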
from scrapy import Spider
from scrapy.http import FormRequest


class WetSealStoreSpider(Spider):
    name = "wetseal_store_spider"
    allowed_domains = ["wetseal.com"]
    start_urls = [
        "http://www.wetseal.com/Stores"
    ]

    def parse(self, response):
        # Submit the state form, selecting Arizona
        yield FormRequest.from_response(response,
                                        formname='dwfrm_storelocator_state',
                                        formdata={'dwfrm_storelocator_address_states_stateUSCA': 'AZ',
                                                  'dwfrm_storelocator_findbystate': 'Search'},
                                        callback=self.parse1)

    def parse1(self, response):
        # Dump the status and body of the response to the form submission
        print(response.status)
        print(response.body)
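When it reaches the FormRequest and I inspect the response, everything looks fine.
But in the callback method, the response shows something different.
It ends up looking like a GET request, and the URL is all wrong: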
'http://www.wetseal.com/Search?q=&dwfrm_storelocator_findbystate=Search&dwfrm_storelocator_address_states_stateUSCA=AZ'
Any idea what I'm doing wrong?
Thanks!
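Answer 0 (score: 1)
You are using formname, but the form has no name attribute, only an id. When no form matches the given name, from_response appears to fall back to the first form on the page (formnumber defaults to 0), which here is presumably the site search form; that would explain the GET request to /Search?q=... that you are seeing.
Try formxpath='id("dwfrm_storelocator_state")' instead.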
Answer 1 (score: 0)
Try this:
# Collect every non-empty state code from the <select>
states = response.xpath(
    ".//select[@id='dwfrm_storelocator_address_states_stateUSCA']//option[@value!='']/@value").extract()
# get_text_from_node is the answerer's own helper (not shown here)
url = self.get_text_from_node(response.xpath("//form[@id='dwfrm_storelocator_state']/@action"))
# Post the form once per state, bypassing from_response entirely
for state in states:
    form_data = {'dwfrm_storelocator_address_states_stateUSCA': state,
                 "dwfrm_storelocator_findbystate": "Search"}
    yield FormRequest(url,
                      formdata=form_data,
                      callback=self.your_Callback)
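To check what each of those requests would carry, the POST body can be previewed with the standard library alone and compared with what Chrome shows. This is only a sketch: it assumes the usual application/x-www-form-urlencoded encoding and the two field names from the form above, and build_form_bodies is a made-up helper name.

```python
from urllib.parse import urlencode

STATE_FIELD = "dwfrm_storelocator_address_states_stateUSCA"
SUBMIT_FIELD = "dwfrm_storelocator_findbystate"


def build_form_bodies(states):
    """Return (state, urlencoded body) pairs, one per state code."""
    return [
        (state, urlencode({STATE_FIELD: state, SUBMIT_FIELD: "Search"}))
        for state in states
    ]


bodies = build_form_bodies(["AZ", "CA"])
for state, body in bodies:
    print(state, body)
```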