Question

I am trying to scrape YouTube comments using Selenium, but I am getting an error. Can anyone help me?

from selenium import webdriver
import time

driver=webdriver.Chrome()
driver.get("https://www.youtube.com/watch?v=MNltVQqJhRE")
time.sleep(10)
D=driver.find_element_by_xpath('//yt-formatted-string[@id="content-text"]')
print(D.text)

I get this:

raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//yt-formatted-string[@id="content-text"]"}

It works with the code below, but it does not scrape all of the comments. I found this code on Stack Overflow.

driver=webdriver.Chrome()

driver.get('https://www.youtube.com/watch?v=iFPMz36std4')

driver.execute_script('window.scrollTo(1, 500);')

# now wait to let the comments load
time.sleep(5)

driver.execute_script('window.scrollTo(1, 3000);')



comment_div=driver.find_element_by_xpath('//*[@id="contents"]')
comments=comment_div.find_elements_by_xpath('//*[@id="content-text"]')
for comment in comments:
    print(comment.text)

Answer 1

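Your problem is that this ID does not exist (at least not for me). Find an ID that does exist. If you are using Chrome, just right-click the element (after inspecting it) and choose to copy its XPath.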
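
If it helps, here is a rough sketch of how you could check which locator actually matches before relying on it. It leans on the fact that find_elements (plural) returns an empty list instead of raising NoSuchElementException. The helper name first_matching_xpath and the candidate list are made up for illustration; driver is assumed to be a Selenium 4 WebDriver, and "xpath" is the string value of By.XPATH.

```python
def first_matching_xpath(driver, candidates):
    """Return (xpath, elements) for the first candidate that matches, else (None, [])."""
    for xpath in candidates:
        # find_elements returns [] when nothing matches; it does not raise.
        elements = driver.find_elements("xpath", xpath)  # "xpath" == By.XPATH
        if elements:
            return xpath, elements
    return None, []
```

With a live driver you would call it with something like first_matching_xpath(driver, ['//yt-formatted-string[@id="content-text"]', '//*[@id="content-text"]']). A (None, []) result usually suggests the comments have not been rendered yet, or that the page markup differs from what you expected.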

Answer 2

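The browser needs to wait until the document is ready; about 4500 ms is a sweet spot. Then you can scroll the comments section into view and wait about 3000 ms for it to load. After that, the #content-text elements will be in the DOM.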

import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
url = "https://www.youtube.com/watch?v=MNltVQqJhRE"
driver.get(url)

# Give the page about 4.5 seconds to finish loading.
time.sleep(4.5)

# Scroll down so the comments section starts loading, then wait about 3 seconds.
driver.execute_script("window.scrollTo(0, document.body.scrollHeight + 500);")
time.sleep(3)

# Selenium 4.3 removed find_element_by_xpath; use find_element(By.XPATH, ...) instead.
content = driver.find_element(By.XPATH, '//yt-formatted-string[@id="content-text"]')

print(content.text)
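
As for the second part of the question (not all comments being scraped): YouTube loads comments in batches as you scroll, so a single scroll only gets the first batch. A minimal sketch of one way to handle that is to keep scrolling until the number of comment elements stops growing. The function name load_all_comments, the pause and max_rounds defaults, and the injectable sleep parameter are all assumptions for illustration, not part of Selenium; the XPath is the one from the question, and "xpath" is the string value of By.XPATH.

```python
import time

COMMENT_XPATH = '//*[@id="content-text"]'  # locator taken from the question


def load_all_comments(driver, pause=3.0, max_rounds=50, sleep=time.sleep):
    """Scroll down until the number of comment elements stops growing."""
    seen = 0
    for _ in range(max_rounds):
        # Scroll to the bottom so the page lazily loads the next batch.
        driver.execute_script(
            "window.scrollTo(0, document.documentElement.scrollHeight);"
        )
        sleep(pause)  # crude fixed wait, in the spirit of the ~3000 ms above
        comments = driver.find_elements("xpath", COMMENT_XPATH)  # "xpath" == By.XPATH
        if len(comments) == seen:
            break  # nothing new appeared, so assume we reached the end
        seen = len(comments)
    # Collect the text of every comment element currently in the DOM.
    return [c.text for c in driver.find_elements("xpath", COMMENT_XPATH)]
```

You would call it after driver.get(url) and the initial wait, for example for text in load_all_comments(driver): print(text). Note that it stops early if the first scroll loads nothing, and that a slow connection may need a longer pause.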

No such element: Unable to locate element when web scraping YouTube comments
