Screenshots look the same after clicking

Date: 2018-02-01 04:16:07

Tags: python html selenium web-scraping phantomjs

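I am trying to take a screenshot of this page using Selenium. Since Chrome and Firefox do not allow full-page screenshots, I am using PhantomJS.

There are two duration options: 12 Month and Month-to-Month. I am therefore trying to click each tab and take a screenshot of it.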

The code that fetches the page content is:

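from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

browser = webdriver.PhantomJS()
browser.set_window_size(1366, 728)
browser.get("http://www.optus.com.au/shop/broadband/mobile-broadband/data-sim-card")
delay = 30  # seconds
try:
    # Wait until a price element is visible before reading the page
    wait = WebDriverWait(browser, delay)
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.price")))
    print("\nPage is ready!")
except TimeoutException:
    print("Loading took too much time!")

html = browser.page_source
soup = BeautifulSoup(html, "html.parser")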

Getting the CSS class names for the durations:

durations = soup.body.find("ul", attrs={'class': 'filter-options grouped'})
duration_filtrs = {}
for content in durations.contents:
    duration = content.text  # Tab label, e.g. '12 Months'
    # Drop empty entries and the 'active' marker so the selector matches in either state
    css_clss = list(filter(lambda x: x not in ['', 'active'], content.attrs['class']))
    filtr_nm = '.' + '.'.join(css_clss)
    duration_filtrs[duration] = filtr_nm

print(duration_filtrs)
# {'12 Months': '.filter-option.contract_length_12', 'Month to Month': '.filter-option.contract_length_1'}

Taking a screenshot of each duration tab:

for duration, css_cls in duration_filtrs.items():
    browser.find_element_by_css_selector(css_cls).click()
    browser.save_screenshot(duration+'.png')

With the code above, the screenshots look the same, even though the file sizes differ slightly.

Can anyone tell me what I am doing wrong?

1 Answer:

Answer 0 (score: 2)

I don't know how to fix this in PhantomJS. As a workaround, I recommend using headless Chrome, as shown below. You only need to specify the window size.

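import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--window-size=2160,3840")  # you can adjust the size as you want
# options= replaces the chrome_options= keyword, which newer Selenium releases deprecate
browser = webdriver.Chrome(options=chrome_options)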
...
...
for duration, css_cls in duration_filtrs.items():
    button = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR,css_cls)))
    browser.save_screenshot('before-'+duration+'.png')
    print(button.text)
    button.click()
    time.sleep(8)
    browser.save_screenshot(duration+'.png')
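
The fixed time.sleep(8) above could probably be replaced by an explicit check that the click has taken effect. The following is only a minimal sketch, not a tested fix for this page: it assumes the selected tab gains the CSS class active (which the question's filter on 'active' suggests, but which I have not verified), and click_and_wait_active is a hypothetical helper name, not part of Selenium. It uses only find_element, click and get_attribute, and polls in plain Python; WebDriverWait with a lambda would be the more Selenium-native way to express the same idea.

```python
import time


def click_and_wait_active(browser, css_selector, timeout=10.0, poll=0.25):
    """Click the element matching css_selector, then poll until its class
    attribute contains 'active'. Returns True on success, False on timeout.

    Assumption: the page marks the selected tab with the class 'active'.
    """
    # "css selector" is the string value behind selenium's By.CSS_SELECTOR.
    browser.find_element("css selector", css_selector).click()
    deadline = time.monotonic() + timeout
    while True:
        # Look the element up again on every poll in case the page re-rendered it.
        element = browser.find_element("css selector", css_selector)
        classes = (element.get_attribute("class") or "").split()
        if "active" in classes:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Inside the loop over duration_filtrs.items(), one would call click_and_wait_active(browser, css_cls) and only then browser.save_screenshot(duration + '.png'). If the plan list itself is re-rendered asynchronously after the tab becomes active, an additional wait on that content may still be needed.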