How to print all titles as text - Python / BeautifulSoup

Time: 2020-11-03 15:06:59

Tags: python html beautifulsoup

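I have a website that I want to scrape.

The site has multiple titles, and I want to print all of them as text.

The code I am using does print the titles, but not as plain text, because find_all collects the matching title tags into a list.

Here is the code: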

import pandas as pd
from bs4 import BeautifulSoup
import csv
import requests


titlelist=[]
url='https://www.hematology.org/meetings/annual-meeting/programs/education-spotlight'
r=requests.get(url)
soup=BeautifulSoup(r.content,'html.parser')
content=soup.find_all('div',class_='col')
for property in content:
    # find_all returns a list of matching tags, so this prints a list, not plain text
    name=property.find_all('h2',class_='smaller')
    print(name)

1 answer:

Answer 0 (score: 1)

Just iterate over the results and use .text to print the text inside each tag. Replace the last for loop with this:

for property in content:
    names=property.find_all('h2',class_='smaller')
    for name in names:
        print(name.text)
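
To see why the original loop printed lists, here is a minimal sketch that runs on an inline HTML snippet instead of the live site. The markup and the two titles are invented for illustration and are not taken from the real page. find_all returns a list-like result of tags, so printing it shows the tags with their markup; .text, or the equivalent get_text(strip=True) used here to also trim whitespace, returns only the string inside each tag.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for illustration only.
html = """
<div class="col"><h2 class="smaller"> First title </h2></div>
<div class="col"><h2 class="smaller">Second title</h2></div>
"""

soup = BeautifulSoup(html, "html.parser")
headings = soup.find_all("h2", class_="smaller")

# Printing the result of find_all shows the tags themselves, markup included.
print(headings)

# Pull the text out of each tag; strip=True trims surrounding whitespace.
titles = [h.get_text(strip=True) for h in headings]
for title in titles:
    print(title)
```

The same idea applies to the nested loop above: each item returned by find_all is a tag, and the text has to be read from it one tag at a time.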

Full code:

import pandas as pd
from bs4 import BeautifulSoup
import csv
import requests


titlelist=[]
url='https://www.hematology.org/meetings/annual-meeting/programs/education-spotlight'
r=requests.get(url)
soup=BeautifulSoup(r.content,'html.parser')
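# every <div class="col"> block on the page
content=soup.find_all('div',class_='col')
for property in content:
    names=property.find_all('h2',class_='smaller')
    # print the plain text of each heading found in this block
    for name in names:
        print(name.text)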

Output:

Appropriate Use of Imaging in Patients with Lymphoma
Emicizumab’s Impact on the Landscape of Hemophilia A Treatment: Two Artists Debate the View
How to Manage Common Challenging Situations in Patients with Multiple Myeloma
Transfusion and Anemia in Global Health
Vascular Anomalies 101: Case-Based Discussion on the Diagnosis, Treatment and Lifelong Care of These Patients
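
If the titles are needed later rather than only printed (the code defines titlelist but never fills it), one possible variation is sketched below. It is not part of the original answer: the helper name extract_titles and the sample markup are hypothetical, and it swaps the nested find_all loops for a single CSS selector through select. One consequence of that swap is that each matching heading is returned once, even if the div.col blocks happen to be nested.

```python
from bs4 import BeautifulSoup


def extract_titles(html):
    """Return the text of every <h2 class="smaller"> inside a <div class="col">.

    Hypothetical helper for illustration; not from the original answer.
    """
    soup = BeautifulSoup(html, "html.parser")
    # CSS selector: any h2.smaller that is a descendant of a div.col
    return [h2.get_text(strip=True) for h2 in soup.select("div.col h2.smaller")]


# Invented sample markup so the sketch runs without a network request.
sample = """
<div class="col"><h2 class="smaller">Title A</h2></div>
<div class="col"><h2 class="smaller">Title B</h2><p>Other text</p></div>
"""

titlelist = extract_titles(sample)
for title in titlelist:
    print(title)
```

For the live page, the same helper could be called with the response body, for example extract_titles(r.text), ideally after r.raise_for_status() so that a failed request is not parsed as if it were the page.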