Scraping an HTML table with Selenium in Python

Time: 2020-02-25 22:35:35

Tags: html python-3.x xpath

I am trying to scrape a web page with Python and Selenium, using XPath:

def scrap(self):
    data=[]
    for tr in driver.find_elements_by_xpath('//table[@class="table expandable"]//tr'):
        #self.tds =tr.find_elements_by_tag_name('td')
        self.tds =tr.find_elements_by_tag_name('th')
        if self.tds:
            data.append([td.text for td in self.tds])

This gives me the following error:

TypeError: scrap() missing 1 required positional argument: 'self'

See the structure of the page here

1 answer:

Answer 0 (score: 0)

Are non-English posts allowed here?

Anyway, I don't see any td tags inside the tr tags shown in your code screenshot; I only see th tags. So maybe try:

self.tds =tr.find_elements_by_tag_name('th')
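
If the table turns out to mix header and data cells, one option is to match both in a single XPath. The following is only a minimal sketch, assuming tr is a Selenium WebElement for one table row. row_cells is a hypothetical helper name, and the string "xpath" is the value of Selenium's By.XPATH constant.

```python
def row_cells(tr):
    """Return the text of each direct th or td child of a row element.

    Assumes tr is a Selenium WebElement; row_cells is a hypothetical helper.
    """
    # "./th | ./td" is an XPath union matching direct th or td children.
    cells = tr.find_elements("xpath", "./th | ./td")
    return [cell.text for cell in cells]
```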

You also don't need self if scrap is a plain function rather than a method of a class:

def scrap():
    data = []
    for tr in driver.find_elements_by_xpath('//table[@class="table expandable"]//tr'):
        # tds = tr.find_elements_by_tag_name('td')
        tds = tr.find_elements_by_tag_name('th')
        # Check inside the loop so every row is collected, not only the last one
        if tds:
            data.append([td.text for td in tds])
    return data
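
As for the error itself: Python raises this TypeError when a function declared with a self parameter is called without an instance, for example directly on the class. A minimal sketch that reproduces it, using a hypothetical Scraper class that is not taken from the question:

```python
class Scraper:
    """Hypothetical class, used only to reproduce the error."""

    def scrap(self):
        # Stands in for the real scraping logic.
        return []


# Calling through an instance works: Python passes the instance as self.
rows = Scraper().scrap()

# Calling on the class itself supplies no instance, so self is missing.
try:
    Scraper.scrap()
except TypeError as exc:
    error_message = str(exc)
```

So either keep self and call scrap on an instance of the class, or drop self and call scrap() as a plain function.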