from bs4 import BeautifulSoup
import requests

req = requests.get('https://www.coingecko.com/en')
soup = BeautifulSoup(req.content, 'html.parser')
last_page = soup.select('ul.pagination li:nth-of-type(8) > a:nth-of-type(1)')[0]['href']
lp = last_page.split('=')[-1]
count = 0
for i in range(int(lp)):
    count += 1
    url = 'https://www.coingecko.com/en?page=' + str(count)
    page = requests.get(url)  # request each page one by one until the last page
    soup = BeautifulSoup(page.content, "html.parser")
    Names = [names.text for names in soup.find_all("span", attrs={"class": "d-none d-lg-flex font-bold align-items-center justify-content-between"})]
    print(Names)
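Here I am just scraping the names of all the coins listed on coingecko.com. The problem is that the outer span tag contains a nested span tag, so I get the text of both spans, and I do not want the nested span's text included. Result: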
Bitcoin BTC
I want only "Bitcoin" or "BTC", or both but separated by a comma. The span tag details are below:
<span class="d-none d-lg-flex font-bold align-items-center justify-content-between">Bitcoin <span class="d-none d-lg-inline font-normal text-3xs ml-2">BTC</span></span>
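A second problem: this code fetches 30 pages of information. I want to save all the coin names in a CSV file, in a single column with the header "COINS NAMES". How can I solve this? Thanks.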
Answer 0 (score: 0)
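The simple way is to use .next, which returns the first node inside the element. For the span in the question that is the text "Bitcoin", without the nested span's "BTC". Then write the data to a CSV file; you can use Python's built-in csv module, or pandas.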
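To make the behaviour concrete, here is a minimal sketch that runs against the span from the question as a literal string, so no request is made. It uses .next_element, which as far as I know is the long-form name that .next refers to. The variable names and the get_text call for the ticker are illustrative choices, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; nothing is fetched from the site.
html = ('<span class="d-none d-lg-flex font-bold align-items-center justify-content-between">'
        'Bitcoin <span class="d-none d-lg-inline font-normal text-3xs ml-2">BTC</span></span>')

soup = BeautifulSoup(html, 'html.parser')
outer = soup.find('span')

# First node inside the outer span: the text that precedes the nested span.
name = outer.next_element.strip()

# The nested span holds the ticker symbol.
ticker = outer.find('span').get_text(strip=True)

# Both values, separated by a comma, as asked in the question.
combined = name + ', ' + ticker

print(name)
print(ticker)
print(combined)
```

The strip call removes the trailing space that sits between the name and the nested span in the markup.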
from bs4 import BeautifulSoup
import requests, csv, os

req = requests.get('https://www.coingecko.com/en')
soup = BeautifulSoup(req.content, 'html.parser')
last_page = soup.select('ul.pagination li:nth-of-type(8) > a:nth-of-type(1)')[0]['href']
lp = last_page.split('=')[-1]
count = 0
for i in range(int(lp)):
    count += 1
    url = 'https://www.coingecko.com/en?page=' + str(count)
    page = requests.get(url)  # request each page one by one until the last page
    soup2 = BeautifulSoup(page.content, "html.parser")
    coinnames = soup2.find_all("span", attrs={
        "class": "d-none d-lg-flex font-bold align-items-center justify-content-between"})
    for coinname in coinnames:
        # .next is the first node inside the span: the name text, without the nested ticker span
        COINSNAMES = coinname.next.strip()
        print(COINSNAMES)
        # save to csv using Python's built-in csv module
        filename = 'coingecko.csv'
        file_exists = os.path.isfile(filename)
        # on Python 3 the csv module expects a text-mode file opened with newline=''
        with open(filename, 'a', newline='') as f:
            fieldnames = ['COINSNAMES']
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            if not file_exists:
                writer.writeheader()  # write the header only when the file is first created
            writer.writerow({'COINSNAMES': COINSNAMES})
Output:
Bitcoin
Ethereum
XRP
Bitcoin Cash
EOS
Stellar
Litecoin
Tether
Cardano
Monero
TRON
Binance Coin
IOTA
Dash
Ontology
NEO
Tezos
Ethereum Classic
NEM
VeChain
Zcash
Dogecoin
0x
Maker
OmiseGo
Bitcoin Gold
Bytecoin
OKB
Lisk
Huobi Token
Qtum
Decred
Aeternity
ICON
BitShares
Nano
DigiByte
Basic Attention Token
Zilliqa
Bitcoin Diamond
Siacoin
Steem
ZB Token
Verge
Holo
Waves
Pundi X
Metaverse ETP
True USD
Electroneum
IOStoken
Golem
Augur
Komodo
MIR COIN
Stratis
ChainLink
Populous
Status
Ardor
Ark
Wanchain
Aion
GSENetwork
aelf
KuCoin Shares
Bytom
Clubcoin
QuarkChain
Odyssey
MaidSafeCoin
Digitex Futures Exchange
Reddcoin
DigixDAO
Aurora
FunFair
GXChain
NEXO
Eternal Token
HyperCash
CyberMiles
Loopring
Dropil
Mithril
QASH
Decentraland
Dentacoin
Nebulas
Loom Network
Ravencoin
Power Ledger
PIVX
MonaCoin
Spendcoin
Elastos
Crypto.com
Horizen
Bancor Network Token
Bankera
Polymath Network
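The question asked for a single column headed "COINS NAMES". As a rough sketch of one way to do that, the names can be collected in a list first and written once at the end, instead of reopening the file for every coin. The helper name save_coin_names, the sample list, and the file name coingecko_sample.csv are hypothetical and used only for illustration; in the real script the list would be filled inside the page loop.

```python
import csv

def save_coin_names(names, filename='coingecko.csv'):
    """Write the names to a one-column CSV with the header 'COINS NAMES'."""
    # 'w' overwrites any earlier file, so the header appears exactly once.
    with open(filename, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['COINS NAMES'])
        # Each name becomes its own single-cell row.
        writer.writerows([name] for name in names)

# Placeholder data standing in for the scraped names (assumption for this sketch).
sample = ['Bitcoin', 'Ethereum', 'XRP']
save_coin_names(sample, 'coingecko_sample.csv')
```

Writing once at the end keeps the file handling out of the scraping loop, at the cost of losing partial results if a request fails midway.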