Question

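I'm writing Python to fetch exchange rates from a website: xe.com/currency/converter (I can't post another link, sorry; I've reached my limit). I want to be able to get rates from this site, for example the conversion between GBP and USD. So I would request the URL "http://www.xe.com/currencyconverter/convert/?Amount=1&From=GBP&To=USD", take the printed value "1.56371 USD" (the rate at the time of writing), and assign that value as an int to a variable such as rate_usd. At the moment I'm thinking of using the BeautifulSoup module and the urllib.request module to request the URL ("http://www.xe.com/currencyconverter/convert/?Amount=1&From=GBP&To=USD") and search through it with BeautifulSoup. This is the stage I've reached in my code: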

import urllib.request
from bs4 import BeautifulSoup

def rates_fetcher(url):
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")
    # code to search through soup and fetch the converted value
    # e.g. 1.56371
    # How would I extract this value?
    # I have inspected the page element and found the value I want to be in the class:
    # <td width="47%" align="left" class="rightCol">1.56371&nbsp;
    # I'm thinking about searching through the class: class="rightCol"
    # and extracting the value that way, but how?
url1 = "http://www.xe.com/currencyconverter/convert/?Amount=1&From=GBP&To=USD"
rates_fetcher(url1)

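Any help is much appreciated, and thanks to anyone who takes the time to read this.

P.S. Sorry in advance for any spelling mistakes, I'm in a bit of a hurry :s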

Answer 1

It sounds like you have the right idea.

def rates_fetcher(url):
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")
    return [item.text for item in soup.find_all(class_='rightCol')]

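That should do it. It returns a list of the text inside every tag with the class 'rightCol'.

If you haven't read through the Beautiful Soup documentation yet, you really should. It's simple and very practical.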
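The list returned above still contains strings such as "1.56371\xa0USD", so one more step is needed to get a number into rate_usd. Here is a minimal sketch of that step, assuming the cell text has the form "<number> <currency code>". The names parse_rate, first_rate and SAMPLE_HTML are made up for illustration, and the sample markup is modeled on the snippet in the question rather than fetched from the live site, whose layout may have changed. Note that float is used rather than int, because int would truncate 1.56371 to 1.

```python
import re

from bs4 import BeautifulSoup

# Sample markup modeled on the <td> shown in the question (an assumption,
# not a copy of the live page).
SAMPLE_HTML = """
<table>
  <tr>
    <td width="47%" align="left" class="rightCol">1.56371&nbsp;USD</td>
  </tr>
</table>
"""


def parse_rate(cell_text):
    """Return the first number found in cell_text as a float (hypothetical helper)."""
    # Drop thousands separators so that "1,234.56" is read as 1234.56.
    match = re.search(r"\d+(?:\.\d+)?", cell_text.replace(",", ""))
    if match is None:
        raise ValueError(f"no number found in {cell_text!r}")
    return float(match.group())


def first_rate(html):
    """Parse html and return the number in the first element with class 'rightCol'."""
    soup = BeautifulSoup(html, "html.parser")
    cell = soup.find(class_="rightCol")
    if cell is None:
        raise ValueError("no element with class 'rightCol' found")
    return parse_rate(cell.get_text())


rate_usd = first_rate(SAMPLE_HTML)
print(rate_usd)
```

With a live page, the same first_rate function could be given the bytes returned by urllib.request.urlopen(url).read() in place of SAMPLE_HTML.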

Answer 2

Try pyquery. It's much better than Beautiful Soup.

PS: Instead of urllib, try Requests: HTTP for Humans.

PS2: Actually, I ended up using Node and jQuery for HTML scraping.

Python: reading a web page and extracting text from that page
