I have been trying to scrape data from the following link:
I was able to identify several frames:
SCROLL_PAUSE_TIME = 2
CYCLES=2
browser = webdriver.Firefox(firefox_options=opt)
browser.get(pge)
sleep(1)
comment_button = browser.find_elements_by_class_name('Ob2kfd')
sleep(1)
comment_button[0].click()
sleep(1)
html = browser.find_element_by_tag_name('html')
frames = browser.find_elements_by_tag_name('iframe')
This finds the frames:
[<selenium.webdriver.remote.webelement.WebElement (session="bbe62090fb83ba8774d855278b17b007", element="0.46172414237768167-3")>,
<selenium.webdriver.remote.webelement.WebElement (session="bbe62090fb83ba8774d855278b17b007", element="0.46172414237768167-4")>,
<selenium.webdriver.remote.webelement.WebElement (session="bbe62090fb83ba8774d855278b17b007", element="0.46172414237768167-5")>,
<selenium.webdriver.remote.webelement.WebElement (session="bbe62090fb83ba8774d855278b17b007", element="0.46172414237768167-6")>,
<selenium.webdriver.remote.webelement.WebElement (session="bbe62090fb83ba8774d855278b17b007", element="0.46172414237768167-7")>,
<selenium.webdriver.remote.webelement.WebElement (session="bbe62090fb83ba8774d855278b17b007", element="0.46172414237768167-8")>]
Now the part that does not work: I cannot switch to the frame that holds the comments. I have tried many approaches:
browser.switch_to.frame(browser.find_element_by_tag_name("iframe"))
WebDriverWait(browser,10).until(EC.frame_to_be_available_and_switch_to_it((browser.find_element_by_tag_name("iframe"))))
WebDriverWait(browser, 20).until(EC.element_to_be_clickable((browser.find_element_by_tag_name("iframe"))))
browser.switch_to.default_content()
browser.switch_to.parent_frame()
browser.switch_to.frame(frames[0])
browser.switch_to.frame(frames[1])
#etc
I also tried looking up the frame ID in the browser, but I am new to this.
browser.switch_to.frame("gci_91f30755d6a6b787dcc2a4062e6e9824.js")
I basically want to scroll down through the comments, but I seem to be stuck in the wrong frame:
sleep(2)
for i in range(CYCLES):
    html.send_keys(Keys.DOWN)
    time.sleep(SCROLL_PAUSE_TIME)
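But nothing works.
Please note that this is not a duplicate. I was glad to find other posts describing similar problems, and I tried every method they mention, but none of them had any effect. Any help would be much appreciated. If you can, please try it against the page link, where it does not seem to work.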
Answer 0 (score: 1)
The element is not inside any iframe. Try using WebDriverWait with a CSS selector. I tried this with Chrome and it works; it shows me the text 127 Google reviews.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium import webdriver
driver=webdriver.Chrome()
driver.get("https://www.google.com/search?rlz=1C1GCEU_en__835__835&ei=gfW0XKjNFeXjkgW8g6PYDA&q=huawei%20stores%20italy&oq=huawei+stores+italy&gs_l=psy-ab.3..0i22i30l3.4596.6067..6522...1.0..0.99.530.7......0....1..gws-wiz.......0i71j0i20i263j0i67j0j33i160.QYVpoia0BL4&npsic=0&rflfq=1&rlha=0&rllag=45499827,9211657,4837&tbm=lcl&rldimm=17275164572016510107&lqi=ChNodWF3ZWkgc3RvcmVzIGl0YWx5IgOIAQFaDwoNaHVhd2VpIHN0b3Jlcw&ved=2ahUKEwiKrMm8g9PhAhXQ4KQKHY9KDc8QvS4wAXoECAoQHQ&rldoc=1&tbs=lrf:!2m1!1e3!2m1!1e16!3sIAE,lf:1,lf_ui:4#rlfi=hd:;si:17275164572016510107,l,ChNodWF3ZWkgc3RvcmVzIGl0YWx5IgOIAQFaDwoNaHVhd2VpIHN0b3Jlcw;mv:!1m2!1d45.5258666!2d9.274078399999999!2m2!1d45.443338999999995!2d9.1152195;tbs:lrf:!2m1!1e3!2m1!1e16!3sIAE,lf:1,lf_ui:4")
element=WebDriverWait(driver,20).until(expected_conditions.element_to_be_clickable((By.CSS_SELECTOR,'span.fl span a span')))
print(element.text)
elements=WebDriverWait(driver,20).until(expected_conditions.presence_of_all_elements_located((By.CSS_SELECTOR,'div.Jtu6Td span')))
for ele in elements:
    print(ele.text)
Is this what you are after? The first print outputs:
127 Google reviews
The second loop prints 3 reviews:
Very good experience, personal are super professional and kind. I became now huawei client :)
The team is really kind and extremely prepared
Good store everything you want from Huawei.
To see more reviews you have to click More Google reviews. I think you can do that.
Edited:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
driver=webdriver.Chrome()
driver.get("https://www.google.com/search?rlz=1C1GCEU_en__835__835&ei=gfW0XKjNFeXjkgW8g6PYDA&q=huawei%20stores%20italy&oq=huawei+stores+italy&gs_l=psy-ab.3..0i22i30l3.4596.6067..6522...1.0..0.99.530.7......0....1..gws-wiz.......0i71j0i20i263j0i67j0j33i160.QYVpoia0BL4&npsic=0&rflfq=1&rlha=0&rllag=45499827,9211657,4837&tbm=lcl&rldimm=17275164572016510107&lqi=ChNodWF3ZWkgc3RvcmVzIGl0YWx5IgOIAQFaDwoNaHVhd2VpIHN0b3Jlcw&ved=2ahUKEwiKrMm8g9PhAhXQ4KQKHY9KDc8QvS4wAXoECAoQHQ&rldoc=1&tbs=lrf:!2m1!1e3!2m1!1e16!3sIAE,lf:1,lf_ui:4#rlfi=hd:;si:17275164572016510107,l,ChNodWF3ZWkgc3RvcmVzIGl0YWx5IgOIAQFaDwoNaHVhd2VpIHN0b3Jlcw;mv:!1m2!1d45.5258666!2d9.274078399999999!2m2!1d45.443338999999995!2d9.1152195;tbs:lrf:!2m1!1e3!2m1!1e16!3sIAE,lf:1,lf_ui:4")
element=WebDriverWait(driver,20).until(expected_conditions.element_to_be_clickable((By.CSS_SELECTOR,'span.fl span a span')))
print(element.text)
no_of_review=int(element.text.split()[0])
print(no_of_review)
elemore=WebDriverWait(driver,20).until(expected_conditions.element_to_be_clickable((By.XPATH,'//span[text()="More Google reviews"]')))
driver.execute_script("arguments[0].click();",elemore)
all_reviews = WebDriverWait(driver, 3).until(expected_conditions.presence_of_all_elements_located((By.CSS_SELECTOR, 'div.gws-localreviews__google-review')))
# Keep scrolling the last loaded review into view until every review is present.
while len(all_reviews) < no_of_review:
    driver.execute_script('arguments[0].scrollIntoView(true);', all_reviews[-1])
    WebDriverWait(driver, 1).until_not(expected_conditions.presence_of_element_located((By.CSS_SELECTOR, 'div[class$="activityIndicator"]')))
    all_reviews = driver.find_elements_by_css_selector('div.gws-localreviews__google-review')
reviews = []
for review in all_reviews:
    try:
        # Long reviews keep their complete text in a hidden span.
        full_text_element = review.find_element_by_css_selector('span.review-full-text')
    except NoSuchElementException:
        full_text_element = review.find_element_by_css_selector('span[class^="r-"]')
    # textContent also returns text of elements that are not displayed.
    reviews.append(full_text_element.get_attribute('textContent'))
print(reviews)
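One caveat about the scrolling loop above: if the page stops loading reviews before the advertised count is reached, the while loop never ends. A minimal sketch of a bounded variant follows. The helper name load_until and its parameters are hypothetical and not part of Selenium; the helper is plain Python and only calls the two functions you pass in.

```python
import time


def load_until(count_loaded, load_more, target, max_rounds=50, pause=0.5):
    """Call load_more() until count_loaded() reaches target.

    Stops early when a round adds nothing new or after max_rounds rounds,
    so the loop cannot spin forever. Returns the final count.
    """
    seen = count_loaded()
    for _ in range(max_rounds):
        if seen >= target:
            break
        load_more()
        time.sleep(pause)  # give the page a moment to append new items
        now = count_loaded()
        if now == seen:
            # Nothing new appeared; assume the list is exhausted.
            break
        seen = now
    return seen
```

With the script above, count_loaded could be a lambda that returns len(driver.find_elements_by_css_selector('div.gws-localreviews__google-review')), and load_more a function that scrolls the last loaded review into view. The pause and round limit are guesses that would need tuning for the real page.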
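If you want to verify for yourself whether an element lives in the main document or in one of the iframes, a small check like the following may help. It is a sketch under assumptions: the function name frames_containing is made up for illustration, it only looks at top-level iframes (not nested ones), and it passes the locator strategies as plain strings, which are the values behind By.CSS_SELECTOR and By.TAG_NAME.

```python
def frames_containing(driver, css):
    """Return where `css` matches: -1 for the main document,
    otherwise the position of each top-level iframe that has a match."""
    hits = []
    driver.switch_to.default_content()
    # "css selector" is the value of By.CSS_SELECTOR
    if driver.find_elements("css selector", css):
        hits.append(-1)
    # "tag name" is the value of By.TAG_NAME
    total = len(driver.find_elements("tag name", "iframe"))
    for index in range(total):
        # Go back to the top and re-locate the frames each round,
        # since references found earlier can go stale after switching.
        driver.switch_to.default_content()
        frames = driver.find_elements("tag name", "iframe")
        if index >= len(frames):
            break
        driver.switch_to.frame(frames[index])
        if driver.find_elements("css selector", css):
            hits.append(index)
    driver.switch_to.default_content()
    return hits
```

For example, frames_containing(driver, 'span.fl span a span') returning [-1] would mean the review link sits in the main document, so no frame switch is needed before locating it.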