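I have a website that I want to scrape.
The site has several titles, and I want to print all of them as text.
The code I'm using does print the titles, but not as plain text, because find_all returns them as a list of tags.
Here is the code: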
import pandas as pd
from bs4 import BeautifulSoup
import csv
import requests
titlelist=[]
url='https://www.hematology.org/meetings/annual-meeting/programs/education-spotlight'
r=requests.get(url)
soup=BeautifulSoup(r.content,'html.parser')
content=soup.find_all('div',class_='col')
for property in content:
    name=property.find_all('h2',class_='smaller')
    print(name)
Answer 0 (score: 1)
Just iterate over the results and use .text to print the text inside each tag. Replace the last for loop with this:
for property in content:
    names=property.find_all('h2',class_='smaller')
    for name in names:
        print(name.text)
Full code:
import pandas as pd
from bs4 import BeautifulSoup
import csv
import requests
titlelist=[]
url='https://www.hematology.org/meetings/annual-meeting/programs/education-spotlight'
r=requests.get(url)
soup=BeautifulSoup(r.content,'html.parser')
content=soup.find_all('div',class_='col')
for property in content:
    names=property.find_all('h2',class_='smaller')
    for name in names:
        print(name.text)
Output:
Appropriate Use of Imaging in Patients with Lymphoma
Emicizumab’s Impact on the Landscape of Hemophilia A Treatment: Two Artists Debate the View
How to Manage Common Challenging Situations in Patients with Multiple Myeloma
Transfusion and Anemia in Global Health
Vascular Anomalies 101: Case-Based Discussion on the Diagnosis, Treatment and Lifelong Care of These Patients
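If you want to keep the titles in a list instead of only printing them (the question declares titlelist but never fills it), something like the sketch below might help. It runs against a small inline HTML string rather than the live page, so the markup in SAMPLE_HTML and the helper name extract_titles are assumptions made up for illustration; the real page's structure may differ. It also uses get_text(strip=True), which trims surrounding whitespace that .text would keep.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup, not copied from the real page.
SAMPLE_HTML = """
<div class="col">
  <h2 class="smaller">  Transfusion and Anemia in Global Health  </h2>
  <h2 class="smaller">Appropriate Use of Imaging in Patients with Lymphoma</h2>
</div>
<div class="col">
  <h2 class="other">Not a session title</h2>
</div>
"""


def extract_titles(html):
    """Return the text of every h2.smaller found inside a div.col."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for column in soup.find_all("div", class_="col"):
        for heading in column.find_all("h2", class_="smaller"):
            # get_text(strip=True) returns a plain str with outer whitespace removed
            titles.append(heading.get_text(strip=True))
    return titles


titles = extract_titles(SAMPLE_HTML)
for title in titles:
    print(title)
```

To use it on the real site, you would pass r.content from the requests.get call in place of SAMPLE_HTML. Note that if a heading contains nested tags, get_text(strip=True) joins the pieces without spaces, so get_text(" ", strip=True) may be the safer choice there.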