Python, Scrapy, Selenium: filling out a form

Time: 2019-05-31 17:40:45

Tags: python selenium scrapy

I have a page where I click the text "Neu hier? Jetzt registrieren". A text box and a button then appear. How can I get rid of the error?

The spider does not find the text box or the button.

HTML part:

<form method="post" action="./index.aspx" onsubmit="javascript:return WebForm_OnSubmit();" id="aspnetForm">
<input type="text" class="email w_200" name="Email" value="" autocomplete="on" tabindex="1" />
<input type="button" class="submit bu" style="width: 197px" tabindex="4" value="Registrieren"/>
</form>

Output of the code below:

start!ready!you are here!

Source:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import scrapy
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from scrapy.http import Request
#https://github.com/mozilla/geckodriver/releases
class northshoreSpider(scrapy.Spider):
    name = 'xxx'
    allowed_domains = ['mypage.de']
    start_urls = ['http://mypage.de']

    def __init__(self, category='', **kwargs):
        print('start!')
        options = Options()
        options.add_argument('--no-sandbox')
        options.add_argument('--disable-dev-shm-usage')
        options.add_argument('--headless')
        options.add_argument('--disable-gpu')
        super().__init__(**kwargs)

        self.driver = webdriver.Chrome(executable_path='/home/chef/Desktop/chromedriver', chrome_options=options)

    def parse(self,response):
        self.driver.get('http://mypage.de')
        try:
            next = self.driver.find_elements_by_xpath("//*[contains(text(), 'Neu hier? Jetzt registrieren')]")[0]
            url = 'http://mypage.de'
            yield Request(url,callback=self.parse2)
            next.click()
        except Exception as e:
            print('error2!' +  str(e))
        self.driver.close()
        print('ready!')

    def parse2(self,response):
        name="Email"
        print('you are here!')
        formdata = {'name': 'mailfrom@gmx.de'}
        yield FormRequest.from_response(response,
                                formnumber=0,
                                formdata=formdata,
                                clickdata={'value': 'Registrieren'},
                                callback=self.parse3)

    def parse3(self,response):
        print("doneTestRegistration")

1 answer:

Answer 0 (score: 0)

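I looked at the documentation at https://docs.scrapy.org/en/latest/topics/request-response.html#formrequest-objects and noticed that you are passing formnumber inside the parse2 function. If this is the first form on the page, formnumber=0 is correct, or you can simply leave it at its default value (zero).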
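
To make the formnumber point concrete, here is a minimal sketch that uses only the standard library (it is not Scrapy's own parser) to list the forms in the question's HTML in document order, so the list index plays the role of formnumber. The names FormLister and list_forms are made up for this illustration.

```python
from html.parser import HTMLParser

# HTML fragment copied from the question.
PAGE = """
<form method="post" action="./index.aspx" onsubmit="javascript:return WebForm_OnSubmit();" id="aspnetForm">
<input type="text" class="email w_200" name="Email" value="" autocomplete="on" tabindex="1" />
<input type="button" class="submit bu" style="width: 197px" tabindex="4" value="Registrieren"/>
</form>
"""


class FormLister(HTMLParser):
    """Hypothetical helper: collect each <form> and its <input> elements."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per form; the index mirrors formnumber
        self._current = None  # the form currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"id": attrs.get("id"), "inputs": []}
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            self._current["inputs"].append(
                {
                    "type": attrs.get("type"),
                    "name": attrs.get("name"),
                    "value": attrs.get("value"),
                }
            )

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def list_forms(html):
    """Return the forms found in html, in document order."""
    parser = FormLister()
    parser.feed(html)
    return parser.forms


forms = list_forms(PAGE)
for number, form in enumerate(forms):
    print(number, form["id"])
    for field in form["inputs"]:
        print("   ", field)
```

For the HTML in the question this lists a single form (index 0) whose text input is named Email and whose button has type="button" and no name attribute. That suggests the formdata key should probably be 'Email' rather than 'name'. Note also that parse2 uses FormRequest without importing it (from scrapy.http import FormRequest), and that Scrapy fetches its response separately from the Selenium session, so a form that only appears after the click may not be present in that response at all.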