Extracting CSS from href links

Date: 2018-08-18 05:52:00

Tags: python beautifulsoup

This code extracts all href links from a website, given the site's URL.

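# Python 2 with BeautifulSoup 3 (urllib2 and the print statement do not exist in Python 3)
from BeautifulSoup import BeautifulSoup
import urllib2
import re

html_page = urllib2.urlopen("http://kteq.in/services")
soup = BeautifulSoup(html_page)
for link in soup.findAll('a'):
    href = link.get('href')
    if href is None:
        # Skip anchors that have no href attribute
        continue
    # Blank out absolute URLs so only relative paths and fragments remain
    result = re.sub(r"http\S+", "", href)
    print result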

When I run the code above, it extracts the site's href links. I get the following output.

  index
  index
  #
  solutions#internet-of-things
  solutions#online-billing-and-payment-solutions
  solutions#customer-relationship-management
  solutions#enterprise-mobility
  solutions#enterprise-content-management
  solutions#artificial-intelligence
  solutions#b2b-and-b2c-web-portals
  solutions#robotics
  solutions#augement-reality-virtual-reality
  solutions#azure
  solutions#omnichannel-commerce
  solutions#document-management
  solutions#enterprise-extranets-and-intranets
  solutions#business-intelligence
  solutions#enterprise-resource-planning
  services
  clients
  contact
  #
  #
  #

  #
  #
  #
  #
  #contactform
  #
  #
  #
  #
  #
  #
  #
  #
  # 
  #
  #
  #
  #
  #
  #
  index
  services
  #
  contact
  #
  iOSDevelopmentServices
  AndroidAppDevelopment
  WindowsAppDevelopment
  HybridSoftwareSolutions
  CloudServices
  HTML5Development
  iPadAppDevelopment
  services
  services
  services
  services
  services
  services
  contact
  contact
  contact
  contact
  contact

  #
  #
  #
  #

Now I need to extract the CSS from these href links. For example, I need to extract the CSS from the 'index' href link that appears in the output. Please advise.

1 Answer:

Answer 0 (score: 0)

You can loop through all the href links you have collected and fetch the CSS links from each of those pages.
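The loop described above can be sketched roughly as follows. This is a minimal Python 3 sketch, not the answer's original code: it swaps in the standard library's `html.parser` for BeautifulSoup so that it runs without extra packages. The names `StylesheetLinkParser`, `extract_css_links`, `fetch_html`, and `css_links_by_page` are hypothetical, and it assumes that CSS files are referenced through `<link rel="stylesheet">` tags and that pages decode as UTF-8.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class StylesheetLinkParser(HTMLParser):
    """Collect the href of every <link> tag whose rel includes "stylesheet"."""

    def __init__(self):
        super().__init__()
        self.css_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, e.g. "alternate stylesheet"
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "stylesheet" in rel_tokens and attr.get("href"):
            self.css_links.append(attr["href"])


def extract_css_links(html_text):
    """Return the stylesheet hrefs found in one HTML document, in page order."""
    parser = StylesheetLinkParser()
    parser.feed(html_text)
    return parser.css_links


def fetch_html(url):
    """Download a page and decode it (assumes UTF-8; undecodable bytes are replaced)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def css_links_by_page(base_url, hrefs, fetch=fetch_html):
    """Map each distinct page URL to the stylesheet links found on that page."""
    results = {}
    for href in hrefs:
        # Skip missing values and pure fragments such as "#" or "#contactform"
        if not href or href.startswith("#"):
            continue
        # Drop the fragment and resolve the relative link against the base URL
        page_url = urljoin(base_url, href.split("#", 1)[0])
        if page_url in results:
            continue  # this page was already fetched
        results[page_url] = extract_css_links(fetch(page_url))
    return results
```

Calling `css_links_by_page("http://kteq.in/services", ["index", "services", "contact"])` would fetch each page once and return its stylesheet links. With BeautifulSoup 4, something like `soup.find_all("link", rel="stylesheet")` should select the same tags.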


For the index page, I got the following CSS links.

Output

bootstrap/bootstrap.min.css
https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css
https://cdn.linearicons.com/free/1.0.0/icon-font.min.css
//fonts.googleapis.com/css
cards/card.css
GalleryStyle/set1.css
css/custom.css
page-transition/css/component.css
page-transition/css/animations.css
https://cdnjs.cloudflare.com/ajax/libs/normalize/5.0.0/normalize.min.css
https://cdnjs.cloudflare.com/ajax/libs/slick-carousel/1.5.5/slick.min.css
css/scrollpage.css
css/changingtext.css
css/color-slider.css
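
Several of the links in this output are relative paths or protocol-relative URLs, so they need to be resolved against the page's address before the stylesheet contents can be downloaded. A minimal Python 3 sketch, assuming the index page lives at `http://kteq.in/index` (inferred from the URL in the question, not confirmed); `resolve_css_urls` is a hypothetical helper name:

```python
from urllib.parse import urljoin


def resolve_css_urls(page_url, css_links):
    """Turn relative and protocol-relative stylesheet links into absolute URLs."""
    return [urljoin(page_url, link) for link in css_links]


links = [
    "bootstrap/bootstrap.min.css",
    "//fonts.googleapis.com/css",
    "https://cdn.linearicons.com/free/1.0.0/icon-font.min.css",
]
# The page URL below is an assumption made for illustration
for url in resolve_css_urls("http://kteq.in/index", links):
    print(url)
```

Absolute URLs pass through unchanged, protocol-relative ones inherit the page's scheme, and relative paths are resolved against the page's directory.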