How to read two child tag values from a parent tag

Time: 2019-07-07 08:47:43

Tags: python python-3.x beautifulsoup

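Is it possible to create two variables: 1) Information_Header, holding the text of <span class="label">, and 2) Information_Details, holding the text embedded directly in the <p> (excluding the <span> elements)?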

For example: Information_Header = Venue and Information_Details = AmCham Office, 1 Scotts Rd, Shaw Centre #23-03 S(228208) - J&J Auditorium

for link in final_urls[:1]:
    webpage_response = requests.get(link)
    event = BeautifulSoup(webpage_response.content, "html.parser")
    title = event.find("h1").get_text()
    name = event.find("p", attrs={"class":"name"}).get_text()
    event_information = event.find("div", attrs={"class":"info"})
    raw_text = event_information.find_all("p")
    print(raw_text)

[<p><span class="label">Venue</span> <span class="divider">:</span> AmCham Office, 1 Scotts Rd, Shaw Centre #23-03 S(228208) - J&amp;J Auditorium</p>, <p><span class="label">Date</span> <span class="divider">:</span> July 09, 2019</p>, <p><span class="label">Time</span> <span class="divider">:</span> 11:45 AM -  1:30 PM </p>, <p><span class="label">Price</span> <span class="divider">:</span> $25.00</p>]

2 answers:

Answer 0: (score: 0)

app.set

Prints:

project-2

Answer 1: (score: 0)

You can use next_sibling after selecting the element with class divider, since that takes you past the ":" to the text that follows it:

I use an example event and add error handling to demonstrate.

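import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://www.amcham.org.sg/event/8914/')
soup = bs(r.content, 'lxml')  # the 'lxml' parser must be installed

# First label on the page (e.g. "Venue") and the ":" divider next to it
information_header = soup.select_one('.label')
information_detail = soup.select_one('.divider')

if information_header is None:
    information_header = 'Not listed'
else:
    information_header = information_header.text

try:
    # The node right after the divider span holds the detail text
    information_detail = information_detail.next_sibling
except AttributeError:  # select_one returned None, so there is no divider
    information_detail = 'Not listed'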
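
The snippet above only reads the first label/divider pair on the page. To get a header and a detail for every <p> in the info block, as the question asks, the same next_sibling idea can be applied per paragraph. The following is only a minimal sketch: the HTML string is copied from the output printed in the question, wrapped in the div with class info that the question's code searches for. The function name extract_info and the "Not listed" fallback are illustrative choices, not part of either original post.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the output shown in the question.
HTML = """
<div class="info">
<p><span class="label">Venue</span> <span class="divider">:</span> AmCham Office, 1 Scotts Rd, Shaw Centre #23-03 S(228208) - J&amp;J Auditorium</p>
<p><span class="label">Date</span> <span class="divider">:</span> July 09, 2019</p>
<p><span class="label">Time</span> <span class="divider">:</span> 11:45 AM -  1:30 PM </p>
<p><span class="label">Price</span> <span class="divider">:</span> $25.00</p>
</div>
"""


def extract_info(html):
    """Return a list of (header, detail) tuples, one per <p> in div.info."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for p in soup.select("div.info p"):
        label = p.select_one(".label")
        divider = p.select_one(".divider")
        # Header: text of the label span, if the paragraph has one.
        header = label.get_text(strip=True) if label else "Not listed"
        # Detail: the node that directly follows the divider span.
        detail = divider.next_sibling if divider else None
        detail = str(detail).strip() if detail else "Not listed"
        pairs.append((header, detail))
    return pairs


info = extract_info(HTML)
for information_header, information_details in info:
    print(information_header, "=", information_details)
```

Each tuple can then be unpacked into Information_Header and Information_Details, or the list can be passed to dict() if the labels are known to be unique.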