Python Scrapy FormRequest does not give the correct URL

Date: 2016-06-15 23:04:07

Tags: python scrapy

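I am trying to scrape some address data from the Howdens website. To do that, however, I need to submit some form data so that the site finds the local depots near a postcode of my choice.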

The starting URL is "https://www.howdens.com/about-us/contact-your-local-depot/", and the source of the form on that page is:

<form id="addressForm" _lpchecked="1">
        <label for="address">
            Enter your Postcode / Town:
            <i id="searchNearest" class="icon-search depot-search"></i>
            <input name="address" id="address" class="address-input" value="" ""="">
        </label>
        <div id="add_matches" class="noDisplay">
            <strong>Multiple results found for your input address. Please select one and search again:</strong>
            <select id="add_sel" onchange="document.getElementById('address').value=this.options[this.selectedIndex].value;"></select>
        </div>
        <p>

        </p>
    </form>

The Python code I am trying to use is:

import scrapy

from scrapy.http import FormRequest, Request
from Howdens.items import HowdensItem

class howdensSpider(scrapy.Spider):
    name = "howdens"
    allowed_domains = ["www.howdens.com"]
    start_urls = [
        "https://www.howdens.com/about-us/contact-your-local-depot/",
    ]

    def parse(self, response):
        # Fill in the postcode field of the address form and submit it.
        yield FormRequest.from_response(
            response,
            formxpath='//*[@id="addressForm"]',
            formdata={'address': 'W3'},
            callback=self.parse_dir_contents,
        )

    def parse_dir_contents(self, response):
        for sel in response.xpath('//*[@id="sidebar"]'):
            item = HowdensItem()
            item['name'] = sel.xpath('./h2/text()').extract()
            item['address'] = sel.xpath('./p/text()').extract()
            yield item

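So far, the problem seems to lie in the FormRequest line. It produces a request to the URL "https://www.howdens.com/about-us/contact-your-local-depot/?address=W3" instead of submitting a request to the site that returns the more detailed URL.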

Any guidance on where I am going wrong would be gratefully received.
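A possible explanation for the URL above, offered as a sketch rather than a confirmed diagnosis: the form shown has no `action` and no `method` attribute. Under standard HTML form rules, a form like that submits with GET to the page's own URL, with the fields appended as a query string. The helper below, `resolve_form_submission`, is a hypothetical name and a simplified model of that rule written with the standard library only; it is not Scrapy's actual implementation. The search itself is presumably triggered by JavaScript attached to the `searchNearest` icon, which is an assumption that would need checking in the browser's network tab.

```python
from urllib.parse import urlencode, urljoin


def resolve_form_submission(page_url, action=None, method=None, formdata=None):
    """Simplified model of how an HTML form submission is resolved.

    Hypothetical helper for illustration only. A missing method falls back
    to GET and a missing action falls back to the page URL.
    """
    method = (method or "GET").upper()
    url = urljoin(page_url, action) if action else page_url
    encoded = urlencode(formdata or {})

    if method == "GET":
        # GET forms carry their fields in the query string.
        if encoded:
            separator = "&" if "?" in url else "?"
            url = url + separator + encoded
        return method, url, None

    # Other methods (e.g. POST) carry the fields in the request body.
    return method, url, encoded


method, url, body = resolve_form_submission(
    "https://www.howdens.com/about-us/contact-your-local-depot/",
    formdata={"address": "W3"},  # same value as in the spider above
)
print(method, url)
```

If this model matches what happens, the request Scrapy builds is a faithful submission of the static form, and the depot data would have to be fetched from whatever request the page's script sends instead.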

0 answers:

No answers