Scraping coinmarketcap.com with Python (requests & BeautifulSoup)

Date: 2018-03-09 11:15:24

Tags: python web web-scraping beautifulsoup python-requests

I want to build a list of coins from coinmarketcap.com. Each element should be a tuple.

Something like:

coins = [('btc',8500,'+0.5%','+1.2%', '-1%'), ...]

I can't get the percentages. The information is in a td like this:

<td class="no-wrap percent-change   text-right positive_change" data-timespan="1h" data-percentusd="0.99" data-symbol="BTC" data-sort="0.991515">0.99%</td>

How can I get the 0.99% value from the td above? What I really need is the percentage data from that td, but I don't know how to select it.

My test script looks like this:

import requests
from bs4 import BeautifulSoup

url = 'https://coinmarketcap.com/all/views/all/'
page = requests.get(url)
soup = BeautifulSoup(page.content,'html.parser')

name = soup.find_all('a', class_='currency-name-container')
price = soup.find_all('a', class_='price')
print(name)
print(price)
# How can I get the percentage change for 1h, 24h, 7d?
#delta_h = soup.find_all('td', ???)

1 answer:

Answer 0 (score: 1)

You can loop over the rows of the table, collect the data for each currency into a tuple, and append that tuple to a list.

import requests
from bs4 import BeautifulSoup

r = requests.get('https://coinmarketcap.com/all/views/all/')
soup = BeautifulSoup(r.text, 'lxml')

data = []
table = soup.find('table', id='currencies-all')
for row in table.find_all('tr'):
    try:
        # find() returns None when a cell is missing, so .text raises AttributeError
        symbol = row.find('td', class_='text-left col-symbol').text
        price = row.find('a', class_='price').text
        # The percent-change cells are told apart by their data-timespan attribute
        time_1h = row.find('td', {'data-timespan': '1h'}).text
        time_24h = row.find('td', {'data-timespan': '24h'}).text
        time_7d = row.find('td', {'data-timespan': '7d'}).text
    except AttributeError:
        # Skip rows that lack any of the cells above
        continue

    data.append((symbol, price, time_1h, time_24h, time_7d))

for item in data:
    print(item)

Partial output:

('BTC', '$8805.46', '0.88%', '-12.30%', '-19.95%')
('ETH', '$677.45', '0.98%', '-11.54%', '-21.66%')
('XRP', '$0.780113', '0.62%', '-10.63%', '-14.42%')
('BCH', '$970.70', '1.01%', '-11.33%', '-23.89%')
('LTC', '$166.70', '0.74%', '-10.06%', '-19.56%')
('NEO', '$83.55', '0.24%', '-16.29%', '-33.39%')
('XLM', '$0.286741', '1.13%', '-13.23%', '-11.84%')
('ADA', '$0.200449', '0.63%', '-16.92%', '-31.43%')
('XMR', '$256.92', '0.63%', '-19.98%', '-19.46%')

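Some currencies in the table are missing data. For those rows find() returns None, and calling .text on it raises an AttributeError. To skip those currencies, I've used try-except.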
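
If you want the numeric value from the data-percentusd attribute rather than the cell text, a tag's attributes can be read with .get(). Below is a minimal sketch of that, not a tested scraper for the live site. It assumes the markup matches the td shown in the question. SAMPLE_HTML, the XYZ row, and the percent_change helper are made up for illustration, and the built-in 'html.parser' is used so the snippet runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical two-row sample modelled on the <td> from the question.
# The second row (XYZ) deliberately lacks the percent-change cell.
SAMPLE_HTML = """
<table id="currencies-all">
  <tr>
    <td class="text-left col-symbol">BTC</td>
    <td class="no-wrap percent-change text-right positive_change" data-timespan="1h" data-percentusd="0.99" data-symbol="BTC" data-sort="0.991515">0.99%</td>
  </tr>
  <tr>
    <td class="text-left col-symbol">XYZ</td>
  </tr>
</table>
"""


def percent_change(row, timespan):
    """Return data-percentusd as a float for the given timespan, or None."""
    cell = row.find('td', {'data-timespan': timespan})
    if cell is None:
        # The row has no cell for this timespan
        return None
    try:
        # .get() reads an attribute; the default '' makes float() fail cleanly
        return float(cell.get('data-percentusd', ''))
    except ValueError:
        # Attribute absent or not numeric
        return None


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
rows = soup.find('table', id='currencies-all').find_all('tr')

# class_='col-symbol' matches any tag that has that class among its classes
changes = [
    (row.find('td', class_='col-symbol').text, percent_change(row, '1h'))
    for row in rows
]
print(changes)
```

Checking for None explicitly, instead of catching AttributeError, keeps a row even when only one of its cells is missing. Whether that is preferable to skipping the row depends on what you do with the list afterwards.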