I am trying to get the HTML/text inside a div. The div has a class of math.
This is the code I am using:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
from bs4 import BeautifulSoup as soup
from bs4 import SoupStrainer
import urllib.request
from selenium.webdriver.common.action_chains import ActionChains
import getpass
ui = input('What is your IXL username?\n\n')
pi = getpass.getpass('\nWhat is your IXL password?\n\n')
driver = 'C:\\Users\\agzsc\\Desktop\\MicrosoftWebDriver.exe'
driver = webdriver.Edge(driver)
driver.get('https://www.ixl.com')
username = driver.find_element_by_id('qlusername')
password = driver.find_element_by_id('qlpassword')
submit = driver.find_element_by_id('qlsubmit')
username.send_keys(ui)
password.send_keys(pi)
ActionChains(driver).move_to_element(submit).click().perform()
for x in range(1):
    time.sleep(1)
    driver.execute_script('''window.open("https://www.ixl.com/math/grade-3/multiply-by-11","_blank");''')
    driver.switch_to_window(driver.window_handles[1+x])
math = soup.find_all('div', attrs={"class":"math"})
print(math)
As you can see, I am using the Selenium WebDriver for Microsoft Edge. I am also trying to use bs4 to parse the page and get only the div with the class math. However, I keep getting this error:
Traceback (most recent call last):
  File "C:\Users\agzsc\Downloads\powerixl.py", line 41, in <module>
    math = soup.find_all('div', attrs={"class":"math"})
  File "C:\Users\agzsc\AppData\Local\Programs\Python\Python36-32\lib\site-packages\bs4\element.py", line 1310, in find_all
    generator = self.descendants
AttributeError: 'str' object has no attribute 'descendants'
If anyone can help, I would be very grateful. Thanks!
Answer 0 (score: 0)
You can replace
soup.find_all('div', attrs={"class":"math"})
with
driver.find_element_by_css_selector('div.math').get_attribute('innerHTML')
if you want the innerHTML of the target div, or
driver.find_element_by_css_selector('div.math').text
if you only want the text of the div.
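As for why the original call fails: in the question's script, soup is the BeautifulSoup class itself (the import alias), not a parsed document. Calling soup.find_all('div', ...) therefore passes the string 'div' as self, and the method breaks as soon as it reads self.descendants. The sketch below reproduces that mechanism with a hypothetical stand-in class, MiniSoup, which is not part of bs4 and only mimics the single attribute access shown in the traceback.

```python
class MiniSoup:
    """Hypothetical stand-in for a parsed document (not the real bs4 class)."""

    def __init__(self, tags):
        # bs4 builds its tree by parsing markup; here the tag names are given directly.
        self.descendants = list(tags)

    def find_all(self, name=None, attrs=None):
        # Like the method in the traceback, this reads self.descendants.
        # attrs is accepted only to mirror the call in the question; it is ignored.
        return [tag for tag in self.descendants if tag == name]


# Calling the method on the class binds 'div' to self, which is a plain str.
try:
    MiniSoup.find_all('div', attrs={"class": "math"})
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# Calling it on an instance works, because self is now a MiniSoup object.
page = MiniSoup(['div', 'span', 'div'])
matches = page.find_all('div')
print(matches)
```

If you would rather keep BeautifulSoup, the analogous fix should be to build an instance first, for example page = soup(driver.page_source, 'html.parser'), and then call page.find_all('div', attrs={"class": "math"}) on that instance instead of on soup.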