BeautifulSoup find_all with parameters

Time: 2018-08-15 06:44:42

Tags: python beautifulsoup

This is my first time using BeautifulSoup, and I can't figure out what I'm doing wrong.

<table class="table sortable table-striped table-condensed r-tab-enabled">
 <thead>
    <tr class="r-tab-buttons r-only-tablet">
       <th class="r-tab-button active" data-defaultsort="disabled" data-group="1">Picks</th>
       <th class="r-tab-button" data-defaultsort="disabled" data-group="2">Bans</th>
       <th class="r-tab-button" data-defaultsort="disabled" data-group="3">Combined</th>
    </tr>

Above is a sample of the HTML page I'm working with; here is my code:

import bs4
import requests
r = requests.get(URL, headers=headers)
soup = bs4.BeautifulSoup(r.text, 'lxml')

table = soup.find_all(lambda tag: tag.name=='table' and tag.has_attr('class') and tag['class'] =="table sortable table-striped table-condensed r-tab-enabled")

It returns nothing, but this works:

table = soup.find_all(lambda tag: tag.name=='table' and tag.has_attr('class'))

So is it supposed to return nothing? If not, how do I pass parameters to find_all?

2 Answers:

Answer 0 (score: 1)

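The problem with the sample code is that it compares tag['class'] with the string "table sortable table-striped table-condensed r-tab-enabled", whereas tag['class'] is a list of class names. A list never compares equal to a string, so the filter rejects every tag.

To fix the code, compare tag['class'] with a list instead: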
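To make the type mismatch concrete, here is a minimal sketch. The shortened HTML string and the variable names are illustrative assumptions, not taken from the asker's page; it uses 'html.parser' so that lxml is not required.

```python
import bs4

# Shortened stand-in for the asker's table (assumption for illustration)
html = '<table class="table sortable table-striped"><tr><td>x</td></tr></table>'
soup = bs4.BeautifulSoup(html, "html.parser")
table = soup.find("table")

# BeautifulSoup treats "class" as a multi-valued attribute and splits it on whitespace
class_value = table["class"]
is_list = isinstance(class_value, list)

# Comparing the list with the raw attribute string is always False
equals_string = class_value == "table sortable table-striped"

# Comparing with a list of the same names in the same order is True
equals_list = class_value == ["table", "sortable", "table-striped"]

print(class_value, is_list, equals_string, equals_list)
```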

table = soup.find_all(lambda tag: tag.name=='table' and tag.has_attr('class') and tag['class'] == ["table", "sortable", "table-striped", "table-condensed", "r-tab-enabled"])

Or, as @Jon pointed out in the comments, use a CSS selector instead:

table = soup.select('table.table.sortable.table-striped.table-condensed')
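One practical difference between the two fixes is that the list comparison depends on the order of the class names, while the selector only checks that each class is present. The sketch below illustrates this under assumed sample HTML with the classes reordered; the set-based lambda is an extra variant that is not part of the original answer.

```python
import bs4

# Hypothetical markup: same classes as in the question, but in a different order
html = """
<table class="r-tab-enabled table sortable table-striped table-condensed"></table>
<table class="table"></table>
"""
soup = bs4.BeautifulSoup(html, "html.parser")

# The selector matches any table carrying all four classes, in any order
by_selector = soup.select("table.table.sortable.table-striped.table-condensed")

# Order-independent lambda: every wanted class must appear in the tag's class list
wanted = {"table", "sortable", "table-striped", "table-condensed", "r-tab-enabled"}
by_set = soup.find_all(
    lambda tag: tag.name == "table" and wanted.issubset(tag.get("class", []))
)

# Exact list comparison fails here because the order differs
by_exact_list = soup.find_all(
    lambda tag: tag.name == "table"
    and tag.get("class")
    == ["table", "sortable", "table-striped", "table-condensed", "r-tab-enabled"]
)

print(len(by_selector), len(by_set), len(by_exact_list))
```

If the page always emits the classes in one fixed order, the list comparison is fine; the selector or the set check is the safer choice when that is not guaranteed.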

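Answer 1 (score: 0)

Why go through all that? You can simply call find_all('table', class_='classes string'), where the string is the full value of the class attribute, to get every matching table from the HTML: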

import bs4

# Sample HTML from the question (truncated; html.parser tolerates the unclosed tags)
text = """
<table class="table sortable table-striped table-condensed r-tab-enabled">
 <thead>
    <tr class="r-tab-buttons r-only-tablet">
       <th class="r-tab-button active" data-defaultsort="disabled" data-group="1">Picks</th>
       <th class="r-tab-button" data-defaultsort="disabled" data-group="2">Bans</th>
       <th class="r-tab-button" data-defaultsort="disabled" data-group="3">Combined</th>
    </tr>
"""

soup = bs4.BeautifulSoup(text, 'html.parser')

# Passing the full class string to class_ matches the attribute value exactly
for table in soup.find_all('table', class_="table sortable table-striped table-condensed r-tab-enabled"):
    print(table)

This prints the table you are after. Hope it helps!
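A caveat worth knowing when passing a string to class_: as far as the BeautifulSoup documentation describes it, a single class name matches any tag that carries that class, while a multi-class string is matched against the exact attribute value, so reordering the names finds nothing. The sketch below uses a shortened, assumed HTML string to show the three cases.

```python
import bs4

# Shortened stand-in for the table in the question (assumption for illustration)
html = '<table class="table sortable table-striped"></table>'
soup = bs4.BeautifulSoup(html, "html.parser")

# One class name: matches because the tag has that class among others
single = soup.find_all("table", class_="sortable")

# Full attribute string, same order as in the HTML: matches
exact = soup.find_all("table", class_="table sortable table-striped")

# Same class names in a different order: no match
reordered = soup.find_all("table", class_="sortable table table-striped")

print(len(single), len(exact), len(reordered))
```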