I am trying to extract the pagination from a page using BeautifulSoup.
I managed to select the buttons with
soup.findAll('button', class_='SomeName')
which gives me this:
<button class="SomeName" data-page="2" type="button">2</button>, <button class="SomeName" data-page="3" type="button">3</button>, <button class="SomeName" data-page="4" type="button">4</button>, <button class="SomeName" data-page="5" type="button">5</button>, <button class="SomeName" data-page="6" type="button">6</button>, <button class="SomeName" data-page="7" type="button">7-12</button>
I would like to get a list of the numbers.
Answer 0 (score: 2)
You can take the text between the button tags:
from bs4 import BeautifulSoup as soup
html = '<button class="SomeName" data-page="2" type="button">2</button>, <button class="SomeName" data-page="3" type="button">3</button>, <button class="SomeName" data-page="4" type="button">4</button>, <button class="SomeName" data-page="5" type="button">5</button>, <button class="SomeName" data-page="6" type="button">6</button>, <button class="SomeName" data-page="7" type="button">7-12</button>'
result = [i.text for i in soup(html, 'html.parser').find_all('button')]
Output:
['2', '3', '4', '5', '6', '7-12']
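If integers are wanted rather than label strings, one option is to read each button's data-page attribute, which the markup in the question already carries. This is only a sketch: it assumes every matched button has a data-page attribute, it uses a shortened copy of the question's HTML, and the names parsed and pages are illustrative.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question.
html = (
    '<button class="SomeName" data-page="2" type="button">2</button>'
    '<button class="SomeName" data-page="6" type="button">6</button>'
    '<button class="SomeName" data-page="7" type="button">7-12</button>'
)

parsed = BeautifulSoup(html, 'html.parser')

# Read the data-page attribute instead of the visible label,
# so the button labelled "7-12" contributes its page number 7.
pages = [
    int(button['data-page'])
    for button in parsed.find_all('button', class_='SomeName')
]
print(pages)  # [2, 6, 7]
```

Note that this returns the page each button points to, not the label, so a range label such as "7-12" is reduced to its first page. Use the .text approach above if the label itself is needed.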
Answer 1 (score: 1)
You can also use a CSS selector:
output = [button.text for button in soup.select('button.SomeName')]
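For completeness, here is a self-contained version of the selector approach. It is a sketch under assumptions: the HTML is a shortened copy of the question's markup with one extra button of a hypothetical class "Other" added to show the filtering, and get_text(strip=True) is used instead of .text to trim any surrounding whitespace.

```python
from bs4 import BeautifulSoup

# Shortened markup; the "Other" button is a made-up extra to show filtering.
html = (
    '<button class="SomeName" data-page="2" type="button">2</button>'
    '<button class="Other" type="button">Next</button>'
    '<button class="SomeName" data-page="7" type="button"> 7-12 </button>'
)

parsed = BeautifulSoup(html, 'html.parser')

# 'button.SomeName' matches only <button> elements carrying that class.
labels = [button.get_text(strip=True) for button in parsed.select('button.SomeName')]
print(labels)  # ['2', '7-12']
```

Here soup in the one-liner above stands for the parsed document (as in the question), whereas the previous answer uses soup as an alias for the BeautifulSoup class.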