Hi, I want to get the content of a table on a website using Selenium.
Here is my first attempt:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.javatpoint.com/html-table')
texts = driver.find_elements_by_xpath("//td/table")
print(texts)
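Then I tried to convert the result to its text content by using
get_property('textContent')
but I get the error AttributeError: 'NoneType' object has no attribute 'get_property'.
So I would like to know how to get the table content and convert it into an array.
P.S. Python version 3.7.4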
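Answer 0 (score: 0)
find_elements returns a list of elements, so there is no single object to call a method on. If you want all the text in each table, read .text from every element in that list, like this: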
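from pprint import pprint
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get('https://www.javatpoint.com/html-table')
# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which newer Selenium 4 releases no longer provide
texts = [i.text for i in driver.find_elements(By.XPATH, "//td/table")]
pprint(texts)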
Output:
['Tag Description\n'
'<table> It defines a table.\n'
'<tr> It defines a row in a table.\n'
'<th> It defines a header cell in a table.\n'
'<td> It defines a cell in a table.\n'
'<caption> It defines the table caption.\n'
'<colgroup> It specifies a group of one or more columns in a table for '
'formatting.\n'
'<col> It is used with <colgroup> element to specify column properties for '
'each column.\n'
'<tbody> It is used to group the body content in a table.\n'
'<thead> It is used to group the header content in a table.\n'
'<tfooter> It is used to group the footer content in a table.',
'First_Name Last_Name Marks\n'
'Sonoo Jaiswal 60\n'
'James William 80\n'
'Swati Sironi 82\n'
'Chetna Singh 72',
'First_Name Last_Name Marks\n'
'Sonoo Jaiswal 60\n'
'James William 80\n'
'Swati Sironi 82\n'
'Chetna Singh 72',
'Name Last Name Marks\n'
'Sonoo Jaiswal 60\n'
'James William 80\n'
'Swati Sironi 82\n'
'Chetna Singh 72',
'Name Last Name Marks\n'
'Sonoo Jaiswal 60\n'
'James William 80\n'
'Swati Sironi 82\n'
'Chetna Singh 72',
'Name Mobile No.\nAjeet Maurya 7503520801 9555879135',
'Name Ajeet Maurya\nMobile No. 7503520801\n9555879135',
'Element Chrome IE Firefox Opera Safari\n<table> Yes Yes Yes Yes Yes']
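Each entry in the output above is one string per table, with rows separated by newlines. A minimal sketch of turning such a string into a list of rows follows. The helper name text_to_rows is hypothetical (it is not part of Selenium), and it assumes that no cell contains a space, so it would split a header such as "Last Name" into two cells:

```python
def text_to_rows(table_text):
    """Split the .text of a table into rows, each a list of cell strings.

    Assumption: cells contain no spaces, so whitespace separates the columns.
    """
    return [line.split() for line in table_text.split("\n") if line.strip()]


# Sample string abbreviated from the second entry of the output above.
sample = "First_Name Last_Name Marks\nSonoo Jaiswal 60\nJames William 80"
rows = text_to_rows(sample)
print(rows)
```

For tables whose cells do contain spaces, reading the cells element by element is more reliable than splitting the flattened text.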
If you would rather use get_property('textContent'), you can do the following:
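from pprint import pprint
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get('https://www.javatpoint.com/html-table')
# textContent is the raw DOM text, so source whitespace is kept and adjacent cells are not separated
texts = [i.get_property('textContent') for i in driver.find_elements(By.XPATH, "//td/table")]
pprint(texts)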
Output:
['\n'
'TagDescription\n'
'<table>It defines a table.\n'
'<tr>It defines a row in a table.\n'
'<th>It defines a header cell in a table.\n'
'<td>It defines a cell in a table.\n'
'<caption>It defines the table caption.\n'
'<colgroup>It specifies a group of one or more columns in a table for '
'formatting.\n'
'<col>It is used with <colgroup> element to specify column properties for '
'each\n'
'column.\n'
'<tbody>It is used to group the body content in a table.\n'
'<thead>It is used to group the header content in a table.\n'
'<tfooter>It is used to group the footer content in a table.\n',
'\n'
'First_NameLast_NameMarks\n'
'SonooJaiswal60\n'
'JamesWilliam80\n'
'SwatiSironi82\n'
'ChetnaSingh72\n',
'\n'
'First_NameLast_NameMarks\n'
'SonooJaiswal60\n'
'JamesWilliam80\n'
'SwatiSironi82\n'
'ChetnaSingh72\n',
'\n'
' \n'
' Name\n'
' Last Name\n'
' Marks\n'
' \n'
' SonooJaiswal60\n'
'JamesWilliam80\n'
'SwatiSironi82\n'
'ChetnaSingh72\n',
'\n'
' \n'
' Name\n'
' Last Name\n'
' Marks\n'
' \n'
' SonooJaiswal60\n'
'JamesWilliam80\n'
'SwatiSironi82\n'
'ChetnaSingh72\n',
'\n'
' \n'
' Name\n'
' Mobile No.\n'
' \n'
' \n'
' Ajeet Maurya\n'
' 7503520801\n'
' 9555879135\n'
' \n',
' \nNameAjeet Maurya \nMobile No.7503520801 \n9555879135 \n',
'\nElement Chrome IE Firefox Opera Safari\n<table>YesYesYesYesYes\n']
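Both .text and textContent flatten each table into a single string, and textContent even runs adjacent cells together (for example SonooJaiswal60). One way to keep the cell boundaries, sketched here under the assumption that you pass in the table's markup obtained with something like get_attribute('outerHTML') on the table element, is to parse that markup with the standard library's html.parser. TableParser and html_to_rows are hypothetical names used for illustration, and nested tables are not handled:

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace in the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_to_rows(table_html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableParser()
    parser.feed(table_html)
    parser.close()
    return parser.rows


# Hand-written sample markup standing in for a table's outerHTML.
sample_html = """
<table>
  <tr><th>First_Name</th><th>Last Name</th><th>Marks</th></tr>
  <tr><td>Sonoo</td><td>Jaiswal</td><td>60</td></tr>
</table>
"""
rows = html_to_rows(sample_html)
print(rows)
```

Because the cells are read from the markup rather than from flattened text, a cell such as "Last Name" stays in one piece.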
Answer 1 (score: 0)
This stores the contents of the table cells in an array:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome('/usr/local/bin/chromedriver')  # Optional argument, if not specified will search path.
driver.implicitly_wait(15)
driver.get("https://www.javatpoint.com/html-table")
# Select every <td> of the first table with class 'alt' that follows the 'HTML table' paragraph
table_cells = driver.find_elements(By.XPATH, "(//p[contains(text(),'HTML table')]/following-sibling::table[@class='alt'])[1]//tr/td")
x = []
for cell in table_cells:
    # print(cell.text)
    x.append(cell.text)
print(x)
driver.quit()
Output (cell texts, shown one per line):
<table>
It defines a table.
<tr>
It defines a row in a table.
<th>
It defines a header cell in a table.
<td>
It defines a cell in a table.
<caption>
It defines the table caption.
<colgroup>
It specifies a group of one or more columns in a table for formatting.
<col>
It is used with <colgroup> element to specify column properties for each column.
<tbody>
It is used to group the body content in a table.
<thead>
It is used to group the header content in a table.
<tfooter>
It is used to group the footer content in a table.
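
The list x built by the code above is flat: tag, description, tag, description, and so on. A small sketch of regrouping it into rows follows, assuming every row of this table has exactly two cells. The helper name chunk is hypothetical, and the sample list is abbreviated from the output above:

```python
def chunk(cells, width):
    """Group a flat list of cell texts into rows of `width` cells each."""
    return [cells[i:i + width] for i in range(0, len(cells), width)]


# First four cell texts from the output above.
cells = ["<table>", "It defines a table.", "<tr>", "It defines a row in a table."]
rows = chunk(cells, 2)
print(rows)
```

If the number of cells is not a multiple of the width, the last row is simply shorter.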