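Cannot take screenshot with 0 width

Posted: 2019-07-16 15:12:24

Tags: python python-3.x selenium web-scraping

I am trying to take a screenshot of an element inside a Bootstrap modal. After some effort, I came up with this code: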

driver.get('https://enlinea.sunedu.gob.pe/')
driver.find_element_by_xpath('//div[contains(@class, "img_publica")]').click()

WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'modalConstancia')))
driver.find_element_by_xpath('//div[contains(@id, "modalConstancia")]').click()
active_element = driver.switch_to.active_element
active_element.find_elements_by_id('doc')[0].send_keys(graduate.id)

# Can't take this screenshot
active_element.find_elements_by_id('captchaImg')[0].screenshot_as_png('test.png')

The error is:

Traceback (most recent call last):
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/rq/worker.py", line 812, in perform_job
    rv = job.perform()
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/rq/job.py", line 588, in perform
    self._result = self._execute()
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/rq/job.py", line 594, in _execute
    return self.func(*self.args, **self.kwargs)
  File "./jobs/sunedu.py", line 82, in scrap_document_number
    record = scrap_and_recognize(driver, graduate)
  File "./jobs/sunedu.py", line 33, in scrap_and_recognize
    active_element.find_elements_by_id('captchaImg')[0].screenshot_as_png('test.png')
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 567, in screenshot_as_png
    return base64.b64decode(self.screenshot_as_base64.encode('ascii'))
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 557, in screenshot_as_base64
    return self._execute(Command.ELEMENT_SCREENSHOT)['value']
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: unhandled inspector error: {"code":-32000,"message":"Cannot take screenshot with 0 width."}
  (Session info: chrome=75.0.3770.100)
  (Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Linux 4.4.0-154-generic x86_64)

After debugging, I realized that the element has no width or height:

(Pdb) active_element.find_elements_by_id('captchaImg')[0].rect
{'height': 0, 'width': 0, 'x': 0, 'y': 0}
(Pdb) active_element.find_elements_by_id('captchaImg')[0].size
{'height': 0, 'width': 0}

I think this is why it fails. Is there a way to work around it?


These are the steps:

  1. Click the link.

  2. Wait for the modal and fill in the first input.

  3. Try to take a screenshot of the captcha image.

If I inspect the element in the browser (the span that holds the captcha image), it turns out to be 100x50.

1 answer:

Answer 0 (score: 1)

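OK, I figured out why you keep getting the "Cannot take screenshot with 0 width." error. The page contains more than one captcha, and a non-specific selector returns a hidden captcha image (probably one that belongs to another modal window). Making the selector more specific should give you the right image.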
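As a complementary check, one could collect every match and keep only the elements that are actually rendered. The sketch below is only an illustration: `visible_nonempty` and `FakeElement` are hypothetical names, and the stand-in class exists so the snippet runs without a browser. With a real driver, the list would come from something like `driver.find_elements_by_id('captchaImg')`, whose items offer `is_displayed()` and `size`.

```python
from typing import Iterable, List


def visible_nonempty(elements: Iterable) -> List:
    """Keep elements that are displayed and have a non-zero rendered size.

    Each item is expected to behave like a Selenium WebElement, i.e. to
    offer is_displayed() and a size dict with 'width' and 'height' keys.
    """
    kept = []
    for el in elements:
        size = el.size
        if el.is_displayed() and size['width'] > 0 and size['height'] > 0:
            kept.append(el)
    return kept


class FakeElement:
    """Hypothetical stand-in for a WebElement (illustration only)."""

    def __init__(self, name: str, displayed: bool, width: int, height: int):
        self.name = name
        self._displayed = displayed
        self.size = {'width': width, 'height': height}

    def is_displayed(self) -> bool:
        return self._displayed


# Two matches for the same locator: one hidden, one inside the open modal.
hidden = FakeElement('hidden captcha', displayed=False, width=0, height=0)
shown = FakeElement('modal captcha', displayed=True, width=100, height=50)

print([el.name for el in visible_nonempty([hidden, shown])])
# prints ['modal captcha']
```

If the filtered list is empty, the element is probably not rendered yet, so waiting for visibility first is still advisable.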

Here is the full script:

from contextlib import contextmanager
from logging import getLogger
from typing import Iterator

from selenium.common.exceptions import TimeoutException
from selenium.webdriver import Chrome, ChromeOptions
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

logger = getLogger(__name__)


@contextmanager
def get_chrome() -> Iterator[Chrome]:
    opts = ChromeOptions()
    # opts.headless = True
    logger.debug('Running Chrome')
    driver = Chrome(options=opts)
    driver.set_window_size(1000, 600)
    logger.debug('Chrome started')
    try:
        yield driver
    finally:
        # quit() ends the whole browser session, even if the body raised
        driver.quit()


def wait_selector_present(driver: Chrome, selector: str, timeout: int = 5):
    cond = EC.presence_of_element_located((By.CSS_SELECTOR, selector))
    try:
        WebDriverWait(driver, timeout).until(cond)
    except TimeoutException as e:
        raise ValueError(f'Cannot find {selector} after {timeout}s') from e


def wait_selector_visible(driver: Chrome, selector: str, timeout: int = 5):
    cond = EC.visibility_of_any_elements_located((By.CSS_SELECTOR, selector))
    try:
        WebDriverWait(driver, timeout).until(cond)
    except TimeoutException as e:
        raise ValueError(f'Cannot find any visible {selector} after {timeout}s') from e


if __name__ == '__main__':
    with get_chrome() as c:
        captcha_sel = '#consultaForm #captchaImg img'
        modal_sel = '[data-target="#modalConstancia"]'

        url = 'https://enlinea.sunedu.gob.pe/'
        c.get(url)

        wait_selector_present(c, modal_sel)
        modal = c.find_element_by_css_selector(modal_sel)
        modal.click()

        wait_selector_visible(c, captcha_sel)
        captcha_img = c.find_element_by_css_selector(captcha_sel)
        captcha_img.screenshot('captcha.png')

Result:

(image of the captured captcha)
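
One more detail about the snippet in the question: in the Selenium Python bindings, `screenshot_as_png` is a property that returns the PNG as bytes, so it cannot be called with a file name. Once the right element is selected, `screenshot_as_png('test.png')` would still fail, this time with a `TypeError`. Either use `screenshot(filename)` as in the script above, or write the bytes yourself. A minimal sketch of the second option follows; `save_element_png` and `FakeScreenshotElement` are hypothetical names, and the stand-in class only exists so the snippet runs without a browser.

```python
import os
import tempfile


def save_element_png(element, path: str) -> int:
    """Write element.screenshot_as_png (bytes) to path; return bytes written."""
    data = element.screenshot_as_png  # property access, no parentheses
    with open(path, 'wb') as fh:
        return fh.write(data)


class FakeScreenshotElement:
    """Hypothetical stand-in for a WebElement (illustration only)."""

    # The 8-byte PNG file signature, used here as placeholder image data.
    PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

    @property
    def screenshot_as_png(self) -> bytes:
        return self.PNG_SIGNATURE


target = os.path.join(tempfile.mkdtemp(), 'test.png')
written = save_element_png(FakeScreenshotElement(), target)
print(written, 'bytes written to', target)
```

With a real element, `save_element_png(captcha_img, 'test.png')` should produce the same file as `captcha_img.screenshot('test.png')`.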