Take a screenshot of the full page with Python Selenium and Firefox or Chrome

Time: 2018-08-02 12:12:17

Tags: python google-chrome selenium firefox

This post is related to:

Python selenium screen capture not getting whole page

The PhantomJS solution given there seems to work:

from selenium import webdriver

driver = webdriver.PhantomJS()
driver.maximize_window()
driver.get('http://www.angelfire.com/super/badwebs/')
# Scroll through the page in small steps before taking the screenshot
scheight = .1
while scheight < 9.9:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight/%s);" % scheight)
    scheight += .01
driver.save_screenshot('angelfire_phantomjs.png')

However, that solution dates from 2014, and PhantomJS has since been deprecated. I am getting this warning:

...
UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '

If I try to adapt it to, for example, headless Firefox like this:

from selenium import webdriver

firefox_options = webdriver.FirefoxOptions()
firefox_options.set_headless() 
firefox_driver = webdriver.Firefox(firefox_options=firefox_options)

firefox_driver.get('http://www.angelfire.com/super/badwebs/')  
scheight = .1
while scheight < 9.9:
    firefox_driver.execute_script("window.scrollTo(0, document.body.scrollHeight/%s);" % scheight)
    scheight += .01        
firefox_driver.save_screenshot('angelfire_firefox.png')

a screenshot is created, but it does not contain the whole page.

Any idea how to get this working with Firefox or Chrome?

(P.S. I also found this post:

Taking Screenshot of Full Page with Selenium Python (chromedriver)

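but it does not seem to be a general solution, and it is much more complicated.)

1 Answer:

Answer 0: (score: 4)

Here is the approach I came up with; it takes a complete screenshot of a page of any length. It relies on the fact that a headless browser can be started with a window of arbitrary size. The challenge is that the scroll height has to be known before the headless browser is launched, so the page is first loaded in a regular browser to measure it. That is the only drawback: the page is loaded twice.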

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time

url = 'any website url'

# First run: load the page in a regular browser to measure its height
driver = webdriver.Chrome()
driver.get(url)
# Pause 3 seconds to let the page load
time.sleep(3)
# Take the largest of the usual document height measures
height = driver.execute_script("return Math.max( document.body.scrollHeight, document.body.offsetHeight, document.documentElement.clientHeight, document.documentElement.scrollHeight, document.documentElement.offsetHeight )")
print(height)
# Shut down the browser and its driver process
driver.quit()

# Second run: open a headless browser whose window is as tall as the page
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument(f"--window-size=1920,{height}")
chrome_options.add_argument("--hide-scrollbars")
driver = webdriver.Chrome(options=chrome_options)

driver.get(url)
# Pause 3 seconds to let the page load
time.sleep(3)
# Save the screenshot
driver.save_screenshot('screen_shot.png')
driver.quit()
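
The answer names loading the page twice as its only drawback. One possible way around that, sketched here as an assumption rather than as part of the original answer, is to start headless Chrome once, measure the height, and then resize the same window with `set_window_size` before taking the screenshot. The names `PAGE_HEIGHT_JS`, `headless_chrome_args` and `full_page_screenshot` are hypothetical helpers introduced for illustration. Whether the resize takes full effect may depend on the Chrome and chromedriver versions, and on very tall pages the screenshot may still be cut off.

```python
import time

# Same height expression as in the answer above: the largest of the usual measures.
PAGE_HEIGHT_JS = (
    "return Math.max("
    "document.body.scrollHeight, document.body.offsetHeight, "
    "document.documentElement.clientHeight, "
    "document.documentElement.scrollHeight, "
    "document.documentElement.offsetHeight)"
)


def headless_chrome_args(width=1920, height=1080):
    """Build the Chrome flags used in the answer for a given window size."""
    return ["--headless", f"--window-size={width},{height}", "--hide-scrollbars"]


def full_page_screenshot(url, path, width=1920, wait_seconds=3):
    """Load url once in headless Chrome, grow the window, and save a PNG to path."""
    # Imported here so the helpers above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in headless_chrome_args(width):
        options.add_argument(arg)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        time.sleep(wait_seconds)  # crude wait for the page to finish loading
        height = driver.execute_script(PAGE_HEIGHT_JS)
        # Assumption: headless Chrome accepts a window taller than the screen.
        driver.set_window_size(width, height)
        time.sleep(1)  # give the layout a moment to settle after the resize
        driver.save_screenshot(path)
        return height
    finally:
        driver.quit()
```

It could be called as `full_page_screenshot('http://www.angelfire.com/super/badwebs/', 'screen_shot.png')`; the fixed sleeps are kept from the answer for simplicity and could be replaced by an explicit wait.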