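Here I need to read XML data from a URL (an exchange rate list), and the output should be a dictionary... Right now I can only get the first currency... I tried using find_all but without success... Could someone comment on where I need to put the for loop to read all the values?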
import bs4 as bs
import urllib.request
source = urllib.request.urlopen('http://www.xxxy.hr/Downloads/PBZteclist.xml').read()
soup = bs.BeautifulSoup(source,'xml')
name = soup.find('Name').text
unit = soup.find('Unit').text
buyratecache = soup.find('BuyRateCache').text
buyrateforeign = soup.find('BuyRateForeign').text
meanrate = soup.find('MeanRate').text
sellrateforeign = soup.find('SellRateForeign').text
sellratecache = soup.find('SellRateCache').text
devize = {'naziv_valute': '{}'.format(name),
'jedinica': '{}'.format(unit),
'kupovni': '{}'.format(buyratecache),
'kupovni_strani': '{}'.format(buyrateforeign),
'srednji': '{}'.format(meanrate),
'prodajni_strani': '{}'.format(sellrateforeign),
'prodajni': '{}'.format(sellratecache)}
print ("devize:",devize)
Sample XML:
<ExchRates>
<ExchRate>
<Bank>Privredna banka Zagreb</Bank>
<CurrencyBase>HRK</CurrencyBase>
<Date>12.01.2019.</Date>
<Currency Code="036">
<Name>AUD</Name>
<Unit>1</Unit>
<BuyRateCache>4,485390</BuyRateCache>
<BuyRateForeign>4,530697</BuyRateForeign>
<MeanRate>4,646869</MeanRate>
<SellRateForeign>4,786275</SellRateForeign>
<SellRateCache>4,834138</SellRateCache>
</Currency>
<Currency Code="124">
<Name>CAD</Name>
<Unit>1</Unit>
<BuyRateCache>4,724225</BuyRateCache>
<BuyRateForeign>4,771944</BuyRateForeign>
<MeanRate>4,869331</MeanRate>
<SellRateForeign>4,991064</SellRateForeign>
<SellRateCache>5,040975</SellRateCache>
</Currency>
<Currency Code="203">
<Name>CZK</Name>
<Unit>1</Unit>
<BuyRateCache>0,280057</BuyRateCache>
<BuyRateForeign>0,284322</BuyRateForeign>
<MeanRate>0,290124</MeanRate>
<SellRateForeign>0,297377</SellRateForeign>
<SellRateCache>0,300351</SellRateCache>
</Currency>
...etc...
</ExchRate>
</ExchRates>
Answer 0 (score: 0)
Simply iterate over all the Currency nodes (rather than the soup object). You can even use a list comprehension to build a list of dictionaries:
soup = bs.BeautifulSoup(source, 'xml')
# ALL EXCHANGE RATE NODES
currency_nodes = soup.find_all('Currency')
# LIST OF DICTIONARIES
devize_list = [{'naziv_valute': c.find('Name').text,
                'jedinica': c.find('Unit').text,
                'kupovni': c.find('BuyRateCache').text,
                'kupovni_strani': c.find('BuyRateForeign').text,
                'srednji': c.find('MeanRate').text,
                'prodajni_strani': c.find('SellRateForeign').text,
                'prodajni': c.find('SellRateCache').text
               } for c in currency_nodes]
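Since the question asks where the for loop goes, here is the same traversal written as an explicit loop. This is only a sketch: sample_xml is an inline stand-in for the downloaded feed (two of the currencies from the sample XML in the question), and the 'xml' parser assumes lxml is installed.

```python
import bs4 as bs

# Inline stand-in for the downloaded feed (assumption: same structure as the
# sample XML in the question, trimmed to two currencies).
sample_xml = """<ExchRates>
<ExchRate>
<Bank>Privredna banka Zagreb</Bank>
<CurrencyBase>HRK</CurrencyBase>
<Date>12.01.2019.</Date>
<Currency Code="036">
<Name>AUD</Name>
<Unit>1</Unit>
<BuyRateCache>4,485390</BuyRateCache>
<BuyRateForeign>4,530697</BuyRateForeign>
<MeanRate>4,646869</MeanRate>
<SellRateForeign>4,786275</SellRateForeign>
<SellRateCache>4,834138</SellRateCache>
</Currency>
<Currency Code="124">
<Name>CAD</Name>
<Unit>1</Unit>
<BuyRateCache>4,724225</BuyRateCache>
<BuyRateForeign>4,771944</BuyRateForeign>
<MeanRate>4,869331</MeanRate>
<SellRateForeign>4,991064</SellRateForeign>
<SellRateCache>5,040975</SellRateCache>
</Currency>
</ExchRate>
</ExchRates>"""

soup = bs.BeautifulSoup(sample_xml, 'xml')

devize_list = []
# The loop runs over the Currency nodes; each find() is then called on the
# current node c, so it only sees that currency's children.
for c in soup.find_all('Currency'):
    devize_list.append({
        'naziv_valute': c.find('Name').text,
        'jedinica': c.find('Unit').text,
        'kupovni': c.find('BuyRateCache').text,
        'kupovni_strani': c.find('BuyRateForeign').text,
        'srednji': c.find('MeanRate').text,
        'prodajni_strani': c.find('SellRateForeign').text,
        'prodajni': c.find('SellRateCache').text,
    })

for devize in devize_list:
    print("devize:", devize)
```

The original code only returned the first currency because soup.find('Name') searches the whole document and stops at the first match. Calling find on each Currency node restricts the search to that node.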
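Alternatively, since you are extracting every child element, combine this with a dictionary comprehension (the keys are then the XML tag names):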
# ONE DICTIONARY PER CURRENCY NODE, KEYED BY TAG NAME
devize_list = [{n.name: n.text for n in c.children if n.name is not None}
               for c in currency_nodes]
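Note that the rates come out of the XML as strings with a comma as the decimal separator (for example 4,485390). If numbers are needed, a conversion step along these lines might help. This is a rough sketch: to_decimal and TEXT_KEYS are hypothetical names, the record is copied from the AUD entry in the sample, and it assumes the feed never uses a thousands separator.

```python
from decimal import Decimal


def to_decimal(text):
    """Convert a comma-decimal string such as '4,485390' to a Decimal.

    Assumption: the comma is always the decimal separator and no
    thousands separator appears in the feed.
    """
    return Decimal(text.strip().replace(',', '.'))


# Hypothetical record shaped like one entry of devize_list (AUD from the sample).
devize = {
    'naziv_valute': 'AUD',
    'jedinica': '1',
    'kupovni': '4,485390',
    'kupovni_strani': '4,530697',
    'srednji': '4,646869',
    'prodajni_strani': '4,786275',
    'prodajni': '4,834138',
}

# Keys whose values should stay as text (hypothetical helper set).
TEXT_KEYS = {'naziv_valute'}

converted = {key: (value if key in TEXT_KEYS else to_decimal(value))
             for key, value in devize.items()}

print(converted)
```

Decimal is used here rather than float so that the six decimal places of each rate are kept exactly as published.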