I am trying to store the contents of a table using Selenium and Python. My script is as follows:
import sys
import selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
driver.get("http://testsite.com")
value = selenium.getTable("table_id_10")
print value
driver.close()
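This opens the web page I am interested in and should then save the contents of the table I want. I have seen the browser.get_table() syntax used in this question, but that program begins with browser=Selenium(...), which I do not understand. I am not sure what syntax I should use, since selenium.getTable("table_id_10") is not correct.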
Edit:
I have included an HTML snippet of the table I am working with:
<table class="datatable" cellspacing="0" rules="all" border="1" id="table_id_10" style="width:70%;border-collapse:collapse;">
<caption>
<span class="captioninformation right"><a href="Services.aspx" class="functionlink">Return to Services</a></span>Data
</caption><tr>
<th scope="col">Read Date</th><th class="numericdataheader" scope="col">Days</th><th class="numericdataheader" scope="col">Values</th>
</tr><tr>
<td>10/15/2011</td><td class="numericdata">92</td><td class="numericdata">37</td>
</tr><tr class="alternaterows">
<td>7/15/2011</td><td class="numericdata">91</td><td class="numericdata">27</td>
</tr><tr>
<td>4/15/2011</td><td class="numericdata">90</td><td class="numericdata">25</td>
</tr>
</table>
Answer 0 (score: 17)
The old Selenium RC API included a get_table method:
In [14]: sel=selenium.selenium("localhost",4444,"*firefox", "http://www.google.com/webhp")
In [19]: sel.get_table?
Type: instancemethod
Base Class: <type 'instancemethod'>
String Form: <bound method selenium.get_table of <selenium.selenium.selenium object at 0xb728304c>>
Namespace: Interactive
File: /usr/local/lib/python2.7/dist-packages/selenium/selenium.py
Definition: sel.get_table(self, tableCellAddress)
Docstring:
Gets the text from a cell of a table. The cellAddress syntax
tableLocator.row.column, where row and column start at 0.
'tableCellAddress' is a cell address, e.g. "foo.1.4"
Since you are using the newer WebDriver (a.k.a. Selenium 2) API, that code does not apply.
Perhaps try something like this instead:
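import contextlib
import selenium.webdriver as webdriver

url = "http://testsite.com"  # the page from the question

@contextlib.contextmanager
def quitting(thing):
    # Close the browser even if the body of the with block raises.
    try:
        yield thing
    finally:
        thing.close()
        thing.quit()

with quitting(webdriver.Firefox()) as driver:
    driver.get(url)
    data = []
    # Selenium 2/3 API; in Selenium 4 use find_elements(By.XPATH, ...) and
    # find_elements(By.TAG_NAME, ...) instead.
    for tr in driver.find_elements_by_xpath('//table[@id="table_id_10"]//tr'):
        tds = tr.find_elements_by_tag_name('td')
        if tds:  # the header row has only <th> cells, so it is skipped
            data.append([td.text for td in tds])
    print(data)
# Output under Python 2 (Python 3 shows the same strings without the u prefix):
# [[u'10/15/2011', u'92', u'37'], [u'7/15/2011', u'91', u'27'], [u'4/15/2011', u'90', u'25']]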
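Since the goal is to store the table, one option is to write the `data` list built above to CSV once it has been collected. The following is only a minimal sketch, assuming Python 3. The helper `rows_to_csv` and the output filename are hypothetical names, not part of Selenium. The column names are copied by hand from the `<th>` cells in the question's HTML, because the loop above skips the header row.

```python
import csv
import io

# Column names copied from the <th> cells in the question's HTML snippet.
HEADER = ["Read Date", "Days", "Values"]


def rows_to_csv(rows, header=HEADER):
    """Return the rows as CSV text, with the header line first."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Sample rows matching what the script above collects from the table.
data = [
    ["10/15/2011", "92", "37"],
    ["7/15/2011", "91", "27"],
    ["4/15/2011", "90", "25"],
]

csv_text = rows_to_csv(data)
print(csv_text)

# To keep the result on disk (the filename is an arbitrary choice):
# with open("table_id_10.csv", "w", newline="") as f:
#     f.write(csv_text)
```

The cell values stay as strings here. If you need numbers or dates, convert them before writing, for example with `int(...)` for the Days and Values columns.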