How do I print only the link that contains the right keyword in this code?

Time: 2016-12-16 00:27:08

Tags: python regex web-scraping beautifulsoup lxml

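This is what I have so far. I only want the output to be the link that contains "lion-double-ring", but right now IDLE prints the whole page. The code should loop until it finds the link and then print the link with the given keyword. Maybe regex is the way to go here?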

from bs4 import BeautifulSoup
import requests

r = requests.get('walmart.com')
soup = BeautifulSoup(r.text, 'html')
links = soup.find_all('loc')
if "lion" and "double" in str(links):
    print(str(links))
else:
    print('nothing')

1 Answer:

Answer 0 (score: 2)

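from bs4 import BeautifulSoup
import requests

r = requests.get('http://shop.exclucitylife.com/sitemap_products_1.xml?from=1331122689&to=8543902145')
soup = BeautifulSoup(r.text, 'lxml')
# Each <loc> element in the sitemap holds one product URL.
links = soup.find_all('loc')
for link in links:
    # Test each URL on its own instead of the whole result list at once.
    if 'lion-double-ring' in link.text:
        print(link.text)
        break  # stop at the first match
else:
    # for/else: this branch runs only if the loop ended without a break.
    print('nothing')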
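
A note on why the original condition printed everything: `"lion" and "double" in str(links)` is evaluated as `"lion" and ("double" in str(links))`, and a non-empty string is always truthy, so only "double" is actually tested, and it is tested against the whole list rather than each link. The sketch below shows one way to check several keywords per URL, plus a regex variant since the question asks about it. The helper names `find_links` and `find_first_by_regex` and the `example.com` URLs are made up for illustration; in the real script the list would presumably come from `[link.text for link in soup.find_all('loc')]`.

```python
import re


def find_links(urls, keywords):
    """Return every URL that contains all of the given keywords."""
    return [url for url in urls if all(k in url for k in keywords)]


def find_first_by_regex(urls, pattern):
    """Return the first URL matching the regex, or None if nothing matches."""
    compiled = re.compile(pattern)
    return next((url for url in urls if compiled.search(url)), None)


# Hypothetical sample data standing in for the <loc> texts from the sitemap.
sample_urls = [
    'http://example.com/products/tiger-single-ring',
    'http://example.com/products/lion-double-ring',
    'http://example.com/products/lion-pendant',
]

print(find_links(sample_urls, ['lion', 'double']))
print(find_first_by_regex(sample_urls, r'lion-double-ring'))
```

For a fixed keyword such as 'lion-double-ring', the plain `in` test from the answer is enough. A regex is probably only worth it when the keyword varies in a pattern-like way.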