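I am trying to scrape the name and category of every API listed at https://www.programmableweb.com/apis/directory and print them in this format:
Name: Google Maps
Category: Mapping
For some reason, my code only prints the first row.
My code: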
from bs4 import BeautifulSoup as bs
import requests
url = 'https://www.programmableweb.com/apis/directory'
response = requests.get(url)
data = response.text
soup = bs(data, 'html.parser')
info = soup.find_all('table',{'class':'views-table cols-4 table'})
for i in info:
    name = soup.find('td',{'class':'views-field views-field-title col-md-3'}).text
    category = soup.find('td',{'class':'views-field views-field-field-article-primary-category'}).text
    print('name:',name, '\nCategory:', category)
If you can help me further, what I would like to do is:
Answer 0 (score: 1)
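You are not iterating over the rows of the table. You find the <table> tags (there is only one) and then loop over those. What you want to do is find all the <tr> tags inside the <table> tag and loop over the <tr> tags instead. Inside the loop you are also calling find on the soup object, which always returns the first matching element in the whole document, rather than searching within each element of info.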
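As a minimal sketch of that row-by-row approach: the snippet below parses a small inline HTML sample instead of the live page, so SAMPLE_HTML and the helper name extract_rows are made up for illustration. The class names are taken from the question's code, and it is an assumption that the real page still uses them.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page, reusing the class names from the question.
SAMPLE_HTML = """
<table class="views-table cols-4 table">
  <tr><th>API Name</th><th>Category</th></tr>
  <tr>
    <td class="views-field views-field-title col-md-3"> Google Maps </td>
    <td class="views-field views-field-field-article-primary-category"> Mapping </td>
  </tr>
  <tr>
    <td class="views-field views-field-title col-md-3"> Twitter </td>
    <td class="views-field views-field-field-article-primary-category"> Social </td>
  </tr>
</table>
"""

def extract_rows(html):
    """Return a list of (name, category) tuples, one per data row of the table."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', class_='views-table')
    if table is None:
        return []
    results = []
    for row in table.find_all('tr'):
        # Search inside the current row, not the whole soup.
        name_cell = row.find('td', class_='views-field-title')
        category_cell = row.find('td', class_='views-field-field-article-primary-category')
        if name_cell is None or category_cell is None:
            continue  # header row or a row without the expected cells
        results.append((name_cell.get_text(strip=True),
                        category_cell.get_text(strip=True)))
    return results

for name, category in extract_rows(SAMPLE_HTML):
    print(f'Name: {name}\nCategory: {category}')
```

To run it against the real site, pass response.text from the question's requests.get call to extract_rows.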
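A simpler solution, since what you are after is a <table> tag, is to let pandas grab it (under the hood it relies on an HTML parser such as lxml or BeautifulSoup). It does all the hard work for you: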
import pandas as pd
url = 'https://www.programmableweb.com/apis/directory'
table = pd.read_html(url)[0]
Output:
print(table.to_string())
    API Name                    Description                                        Category              Submitted
 0  Google Maps                 [This API is no longer available. Google Maps'...  Mapping               12.05.2005
 1  Twitter                     [This API is no longer available. It has been ...  Social                12.08.2006
 2  YouTube                     The Data API allows users to integrate their p...  Video                 02.08.2006
 3  Flickr                      The Flickr API can be used to retrieve photos ...  Photos                09.04.2005
 4  Facebook                    [This API is no longer available. Its function...  Social                08.16.2006
 5  Amazon Product Advertising  What was formerly the ECS - eCommerce Service ...  eCommerce             12.02.2005
 6  Twilio                      Twilio provides a simple hosted API and markup...  Telephony             01.09.2009
 7  Last.fm                     The Last.fm API gives users the ability to bui...  Music                 10.30.2005
 8  Twilio SMS                  Twilio provides a simple hosted API and markup...  Messaging             02.19.2010
 9  Microsoft Bing Maps         Bing Maps API and Interactive SDK features an ...  Mapping               12.02.2005
10  del.icio.us                 From their site: del.icio.us is a social bookm...  Bookmarks             10.30.2005
11  Google App Engine           [This API is no longer available. Its function...  Tools                 12.05.2008
12  Foursquare                  The Foursquare Places API provides location ba...  Social                09.10.2009
13  Google Homepage             From their site: The Google Gadgets API provid...  Widgets               12.14.2005
14  DocuSign Enterprise         DocuSign is a Cloud based legally compliant eS...  Electronic Signature  03.29.2008
15  Amazon S3                   Since 2006 Amazon Web Services has been offeri...  Storage               03.14.2006
16  Google AdSense              The Google AdSense API is ideal for developers...  Advertising           06.01.2006
17  GeoNames                    Geonames is a geographical database with web s...  Reference             01.12.2006
18  Wikipedia                   The unofficial Wikipedia API. Because Wikipedi...  Reference             09.05.2008
19  Box                         Box is a modern content management platform th...  Content               03.07.2006
20  Amazon EC2                  The Amazon Elastic Compute Cloud (Amazon EC2) ...  Cloud                 08.25.2006
21  Bing                        [The Bing API is now the Bing Web Search API. ...  Search                06.04.2009
22  LinkedIn                    LinkedIn is the world's largest business socia...  Social                12.10.2007
23  Instagram Graph             Instagram is a photo sharing iPhone app and se...  Photos                12.15.2010
24  Yelp Fusion                 The Yelp Fusion APIs are RESTful APIs and user...  Recommendations       08.03.2007
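To get from a table like this to the Name/Category format asked for in the question, something along these lines should work. It is only a sketch: the helper name format_entries and the two-row sample frame are made up for illustration, and the column labels 'API Name' and 'Category' are assumed to match the header shown in the output above.

```python
import pandas as pd

def format_entries(table):
    """Build one 'Name: ... / Category: ...' string per row of the DataFrame."""
    entries = []
    # Column labels are assumed to match the header printed by to_string().
    for name, category in zip(table['API Name'], table['Category']):
        entries.append(f'Name: {name}\nCategory: {category}')
    return entries

# Hand-built sample with the same column labels, used instead of the live page.
sample = pd.DataFrame({
    'API Name': ['Google Maps', 'Twitter'],
    'Category': ['Mapping', 'Social'],
})

for entry in format_entries(sample):
    print(entry)
```

With the real data you would pass the table returned by pd.read_html(url)[0] in place of sample.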
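If you click through to the next page, you will see the URL become https://www.programmableweb.com/apis/directory?page=1, so just iterate over the page number in a for loop until you reach the end, appending each page's table to your dataframe after every iteration.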
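A rough sketch of that loop follows. The helper names read_page and scrape_directory and the max_pages cap are made up for illustration, and two things are assumptions: that the first page is also reachable as ?page=0, and that a page past the end yields either no table or an empty one. The per-page tables are collected in a list and combined once with pd.concat rather than appended one at a time, since DataFrame.append was removed in pandas 2.0.

```python
import pandas as pd

BASE_URL = 'https://www.programmableweb.com/apis/directory'

def read_page(page):
    """Fetch one directory page and return its first table as a DataFrame."""
    return pd.read_html(f'{BASE_URL}?page={page}')[0]

def scrape_directory(max_pages, fetch=read_page):
    """Collect the table from pages 0..max_pages-1, stopping early at the end."""
    frames = []
    for page in range(max_pages):
        try:
            table = fetch(page)
        except ValueError:
            # pd.read_html raises ValueError when the page contains no tables.
            break
        if table.empty:
            break
        frames.append(table)
    if not frames:
        return pd.DataFrame()
    # Combine all pages into one frame with a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)
```

Calling scrape_directory(5) would request the first five pages. The fetch parameter is only there so the loop can be tried out with a stand-in function instead of hitting the site.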