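Hi, I have a problem with Python 3.5.2. When I try to get the value of an attribute, I get the whole tags (attributes + values), but I only want the value of the title, and I can't see where the problem is. Here is my code: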
from bs4 import BeautifulSoup as bs
import requests
url = "http://bestofgeeks.com/en/"
html = requests.get(url).text
soup = bs(html,'html.parser')
tagss = soup.findAll('a',{'class':'titre_post'})
print(tagss)
This is what I get:
[<a charset="UTF-8" class="titre_post" href="article_to_read.php?category=Last-Technology&name=854&title=Apple-Watch-Series-2-Waterproof-50-meters-with-Pokemon-Go" hreflang="en" rel="tag" titre="Apple Watch Series 2 Waterproof 50 meters with Pokemon Go">
Apple Watch Series 2 Waterproof 50 meters with Pokemon Go </a>, <a charset="UTF-8" class="titre_post" href="article_to_read.php?category=Security&name=853&title=Warning-This-Cross-Platform-Malware-Can-Hack-Windows-Linux-and-OS-X-Computers" hreflang="en" rel="tag" titre="Warning This Cross Platform Malware Can Hack Windows Linux and OS X Computers">
Warning This Cross Platform Malware Can Hack Windows Linux and OS X Computers </a>, <a charset="UTF-8" class="titre_post" href="article_to_read.php?category=Games&name=852&title=PS4-Slim-Announced,-Launching-This-Month-coming-september-15-for-299$-" hreflang="en" rel="tag" titre="PS4 Slim Announced, Launching This Month coming september 15 for 299$ ">
PS4 Slim Announced, Launching This Month coming september 15 for 299$ </a>, <a charset="UTF-8" class="titre_post" href="article_to_read.php?category=Last-Technology&name=851&title=Sony-New-IFA-products" hreflang="en" rel="tag" titre="Sony New IFA products">
Sony New IFA products </a>, <a charset="UTF-8" class="titre_post" href="article_to_read.php?category=Phone&name=850&title=This-is-the-iPhone-7-waterproofing,-stereo-speakers,-and-dual-cameras" hreflang="en" rel="tag" titre="This is the iPhone 7 waterproofing, stereo speakers, and dual cameras">
This is the iPhone 7 waterproofing, stereo speakers, and dual cameras </a>, <a charset="UTF-8" class="titre_post" href="article_to_read.php?category=Security&name=849&title=Russia-is-Largest-Portal-HACKED;-Nearly-100-Million-Plaintext-Passwords-Leaked" hreflang="en" rel="tag" titre="Russia is Largest Portal HACKED; Nearly 100 Million Plaintext Passwords Leaked">
Russia is Largest Portal HACKED; Nearly 100 Million Plaintext Passwords Leaked </a>]
Answer 0 (score: 0)
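If you only want the text inside the "a" tags: all of your links are already stored in tagss, so just iterate over it and print the text of each one, as shown here: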
for t in tagss:
    # print() is a function in Python 3; strip() removes the surrounding whitespace
    print(t.text.strip())
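To try this without hitting the live site, here is a minimal self-contained sketch of the same idea. The SAMPLE_HTML string and the link_texts helper are made up for illustration (the markup is only modelled on the output in the question), and get_text(strip=True) is used as an alternative to .text.strip():

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modelled on the output shown in the question.
SAMPLE_HTML = """
<a class="titre_post" href="article_to_read.php?name=851" titre="Sony New IFA products">
Sony New IFA products </a>
<a class="other" href="#">Not a post title</a>
"""


def link_texts(html):
    """Return the stripped visible text of every <a class="titre_post"> tag."""
    soup = BeautifulSoup(html, "html.parser")
    links = soup.find_all("a", class_="titre_post")
    # get_text(strip=True) drops the leading newline and trailing space
    return [a.get_text(strip=True) for a in links]


print(link_texts(SAMPLE_HTML))
```

The second link is skipped because its class does not match, so only the post title text is returned.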
Answer 1 (score: 0)
If you want the content of the titre attribute:
tagss = [tag.get('titre') for tag in soup.findAll('a',{'class':'titre_post'})]
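One thing to watch with tag.get('titre') is that it returns None for any tag that lacks the attribute, so the list can end up with None entries. A small sketch of one way to skip those, assuming made-up sample markup (SAMPLE_HTML and titre_values are hypothetical names, not part of the original code):

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second link has no titre attribute.
SAMPLE_HTML = """
<a class="titre_post" href="a.php" titre="Sony New IFA products">Sony New IFA products </a>
<a class="titre_post" href="b.php">Link without a titre attribute</a>
"""


def titre_values(html):
    """Return the titre attribute of each matching link that has one."""
    soup = BeautifulSoup(html, "html.parser")
    links = soup.find_all("a", class_="titre_post")
    # has_attr() filters out tags where get('titre') would return None
    return [a.get("titre") for a in links if a.has_attr("titre")]


print(titre_values(SAMPLE_HTML))
```

Indexing with a['titre'] would also work for tags that have the attribute, but it raises KeyError on the ones that don't, which is why get() or has_attr() is usually the safer choice when scraping.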