Question

I have been trying to use bs4 in Python to extract some information from the tags in the following code sample:

<div class"name">
   <h1 class="fullname"> John Martin </h1>

What I have tried are these two approaches, using soup.find and soup.select:

1)

name = soup.find('h1', class_='name').get_text()
print(name)

2)

n = soup.select(".fullname h1").get_text()
print(n)

Both fail. One raises an error:

AttributeError: 'NoneType' object has no attribute 'get_text'
and the other returns []

What am I doing wrong?

Answer 1

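In the first example, the class_ argument should be fullname: that is the class on the <h1>, while name belongs to the <div>. Since no <h1> has class name, find() returns None, which is why calling get_text() on it raises the AttributeError.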

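In the second example, the CSS selector should be h1.fullname, which selects an <h1> with class=fullname. The selector .fullname h1 instead looks for an <h1> nested inside an element with class fullname, so it matches nothing and returns []. Also, change the method to select_one() to select a single element, because select() returns a list, which has no get_text() method: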

from bs4 import BeautifulSoup

txt = '''<div class"name">
   <h1 class="fullname"> John Martin </h1>'''

soup = BeautifulSoup(txt, 'html.parser')

name= soup.find('h1', class_='fullname').get_text() # <-- change to fullname
print(name)

n=soup.select_one("h1.fullname").get_text() # <-- change to h1.fullname and .select_one()
print(n)

Prints:

 John Martin 
 John Martin
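
Both errors in the question come from calling get_text() on a result that did not match anything. As a minimal sketch (not part of the original answer), the snippet below shows what each lookup returns and guards against the no-match case. It assumes a well-formed version of the HTML (class="name" with the = sign and a closing </div>), and text_or_default is a hypothetical helper name introduced only for illustration. It also passes strip=True to get_text() to drop the surrounding spaces seen in the output above.

```python
from bs4 import BeautifulSoup

# Assumption: a well-formed version of the HTML from the question.
html = '''<div class="name">
   <h1 class="fullname"> John Martin </h1>
</div>'''

soup = BeautifulSoup(html, 'html.parser')

# ".fullname h1" means "an <h1> nested inside an element with class fullname".
# Nothing in this document matches, so select() returns an empty list.
descendant_matches = soup.select(".fullname h1")
print(descendant_matches)  # []

# find() returns None when nothing matches; no <h1> has class "name".
missing = soup.find('h1', class_='name')
print(missing)  # None


def text_or_default(tag, default=''):
    """Hypothetical helper: stripped text of tag, or default if tag is None."""
    return tag.get_text(strip=True) if tag is not None else default


# "h1.fullname" means "an <h1> that itself has class fullname".
full_name = text_or_default(soup.select_one("h1.fullname"))
print(full_name)  # John Martin
```

Checking for None before calling get_text() keeps a missing element from crashing the script, which can help when the page structure is not fully under your control.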
