Part of the HTML looks like this:
<div id="block-hubs3d-hub-hub-specialties" class="block block-hubs3d-hub first odd">
<h3 class="block-title">Specialties</h3>
<div class="field field-name-field-hub-specialties field-type-taxonomy-term-reference field-label-hidden">
<div class="field-items">
<div class="field-item item-1 even">ABS+PLA+Nylon+Flexible</div>
<div class="field-item item-2 odd">Custom Finishing</div>
<div class="field-item item-3 even">DLP - SLA Technology</div>
<div class="field-item item-4 odd">Makerjuice G+</div>
</div>
</div>
How can I get it into a format like this:
specialties: ABS+PLA+Nylon+Flexible, Custom Finishing, DLP - SLA Technology, Makerjuice G+
So far, I only know how to get all the text using bs4:
import bs4
import requests
response = requests.get('https://www.3dhubs.com/new-york/hubs/peerless')
soup = bs4.BeautifulSoup(response.text, 'html.parser')
Answer 0 (score: 2)
Find the div elements by class:
import bs4
h = """
<div id="block-hubs3d-hub-hub-specialties" class="block block-hubs3d-hub first odd">
<h3 class="block-title">Specialties</h3>
<div class="field field-name-field-hub-specialties field-type-taxonomy-term-reference field-label-hidden">
<div class="field-items">
<div class="field-item item-1 even">ABS+PLA+Nylon+Flexible</div>
<div class="field-item item-2 odd">Custom Finishing</div>
<div class="field-item item-3 even">DLP - SLA Technology</div>
<div class="field-item item-4 odd">Makerjuice G+</div>
</div>
</div>
"""
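# Name the parser explicitly so the result does not depend on which parsers are installed
b = bs4.BeautifulSoup(h, "html.parser")
# Collect the text of every div whose class list includes "field-item"
specialties = [div.text for div in b.find_all("div", {"class": "field-item"})]
# Join the extracted strings (not the soup object) into one comma-separated line
print(", ".join(specialties))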
Output:
ABS+PLA+Nylon+Flexible, Custom Finishing, DLP - SLA Technology, Makerjuice G+
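If the full page contains other field-item divs outside the specialties block, searching by class alone may pick those up too. A minimal sketch that scopes the search to the block id from the question and adds the "specialties: " prefix the question asks for might look like this. The function name format_specialties and the trimmed sample markup are illustrative assumptions, not part of the original answer.

```python
import bs4


def format_specialties(html):
    """Return 'specialties: a, b, ...' built from the specialties block (hypothetical helper)."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    # The CSS selector limits matches to field-item divs nested inside the
    # block whose id is block-hubs3d-hub-hub-specialties.
    items = soup.select("#block-hubs3d-hub-hub-specialties div.field-item")
    names = [item.get_text(strip=True) for item in items]
    return "specialties: " + ", ".join(names)


# Trimmed sample markup (an assumption for illustration); the last div sits
# outside the specialties block and should be ignored.
sample = """
<div id="block-hubs3d-hub-hub-specialties" class="block block-hubs3d-hub first odd">
<div class="field-items">
<div class="field-item item-1 even">ABS+PLA+Nylon+Flexible</div>
<div class="field-item item-2 odd">Custom Finishing</div>
</div>
</div>
<div class="field-item">unrelated field</div>
"""

print(format_specialties(sample))
# prints: specialties: ABS+PLA+Nylon+Flexible, Custom Finishing
```

For the live page, passing response.text from the requests call in the question to the same function should work, assuming the page still serves this markup.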