Scrapy does not fill in the form

Time: 2015-04-16 08:50:49

Tags: html scrapy

I am trying to get Scrapy to fill in the following HTML form using FormRequest.from_response:

<form class="form-horizontal" method="POST" role="form">
    <div class="form-group">
        <label class="col-sm-3 control-label" for="inputEmail3"> Username </label>
        <div class="col-sm-9">
            <input class="form-control" value="" maxlength="32" name="pun" />
        </div>
    </div>
    <div class="form-group">
        <label class="col-sm-3 control-label" for="inputEmail3"> Passphrase </label>
        <div class="col-sm-9">
            <input class="form-control" type="password" value="" maxlength="10000" name="ak" />
        </div>
    </div>
</form>
</div>
<div align="right">
    <input id="send" type="submit" value="Login" name="login" />
</div>

I followed the tutorial here, but the code from it does not work for the fields "ak" and "pun". Any ideas or suggestions? Thanks.

Edit: this is what I have so far:

import logging

from scrapy.http import FormRequest
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class TestSpider(CrawlSpider):

    name = "test1"
    allowed_domains = ['...']
    start_urls = [
        '...'
    ]

    rules = (Rule(LinkExtractor(), callback='parse_items', follow=True),)

    def parse_items(self, response):
        # Fill in the form found in the response and submit it.
        return [FormRequest.from_response(response,
                    formdata={"pun": '...', "ak": '...'},
                    callback=self.after_login)]

    def after_login(self, response):
        # Check that the login succeeded before going on
        if b"authentication failed" in response.body:
            self.log("Login failed", level=logging.ERROR)
            return
        # Crawl contents ...

2 answers:

Answer 0 (score: 1)

I solved the problem. All that was needed was to write:

formdata={"pun": '...', "ak": '...', "Login": 'login'}

However, I am still unsure about the reason behind it. Can someone explain?
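
One possible explanation, offered as a sketch rather than a confirmed diagnosis: FormRequest.from_response only collects fields located inside the <form> element, and the submit button in the question sits outside it, so its name/value pair is never added automatically. If the server expects that pair (an assumption, since the question does not show the server side), the login fails until the pair is supplied through formdata. The example below only illustrates how the POST body differs. The credentials `user` and `secret` and the helper `post_body` are hypothetical, and the button pair follows the HTML attributes in the question (name="login", value="Login").

```python
from urllib.parse import urlencode


def post_body(fields):
    """Encode (name, value) pairs the way an HTML form POST body is encoded."""
    return urlencode(fields)


# Placeholder credentials: the question elides the real values.
credentials = [("pun", "user"), ("ak", "secret")]
# The submit button from the question's HTML: name="login", value="Login".
submit_button = [("login", "Login")]

# Body built from the fields inside <form> only.
print(post_body(credentials))                  # pun=user&ak=secret
# Body once the button's pair is added by hand through formdata.
print(post_body(credentials + submit_button))  # pun=user&ak=secret&login=Login
```

A browser would send the second body, because clicking the button submits its pair along with the other fields.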

Answer 1 (score: 0)

The submit button must be inside the <form> tag.

Try this:

<form class="form-horizontal" method="POST" role="form">
    <div class="form-group">
        <label class="col-sm-3 control-label" for="inputEmail3"> Username </label>
        <div class="col-sm-9">
            <input class="form-control" value="" maxlength="32" name="pun" />
        </div>
    </div>
    <div class="form-group">
        <label class="col-sm-3 control-label" for="inputEmail3"> Passphrase </label>
        <div class="col-sm-9">
            <input class="form-control" type="password" value="" maxlength="10000" name="ak" />
        </div>
    </div>
    <div align="right">
        <input id="send" type="submit" value="Login" name="login" />
    </div>
</form>
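
The difference between the two layouts can be checked without Scrapy. The sketch below uses Python's standard html.parser and is not Scrapy's implementation. The names `FormInputCollector` and `split_inputs` and the trimmed-down HTML strings are illustrative assumptions. It simply lists which named inputs fall inside a <form> and which fall outside.

```python
from html.parser import HTMLParser


class FormInputCollector(HTMLParser):
    """Record the name of every <input>, split by whether it sits inside a <form>."""

    def __init__(self):
        super().__init__()
        self.form_depth = 0   # greater than 0 while the parser is inside a <form>
        self.inside = []      # names of inputs found inside a form
        self.outside = []     # names of inputs found outside any form

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.form_depth += 1
        elif tag == "input":
            name = dict(attrs).get("name")
            if name:
                (self.inside if self.form_depth else self.outside).append(name)

    def handle_endtag(self, tag):
        if tag == "form" and self.form_depth:
            self.form_depth -= 1


def split_inputs(html):
    """Return (names inside a form, names outside any form)."""
    collector = FormInputCollector()
    collector.feed(html)
    collector.close()
    return collector.inside, collector.outside


# Trimmed version of the question's markup: the button follows </form>.
QUESTION_HTML = """
<form method="POST">
    <input name="pun" value="" />
    <input type="password" name="ak" value="" />
</form>
<div align="right">
    <input id="send" type="submit" value="Login" name="login" />
</div>
"""

# Trimmed version of this answer's markup: the button precedes </form>.
ANSWER_HTML = """
<form method="POST">
    <input name="pun" value="" />
    <input type="password" name="ak" value="" />
    <div align="right">
        <input id="send" type="submit" value="Login" name="login" />
    </div>
</form>
"""

print(split_inputs(QUESTION_HTML))  # (['pun', 'ak'], ['login'])
print(split_inputs(ANSWER_HTML))    # (['pun', 'ak', 'login'], [])
```

With the button inside the form, a form-based tool has all three fields available. With the original layout, "login" is not part of the form and has to be passed by hand.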