Scraping values from the web with Python and Beautiful Soup

Time: 2018-03-09 10:43:35

Tags: python html web-scraping beautifulsoup python-requests

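I am trying to scrape this website but am having trouble extracting the correct values. The site lists prices for silver, gold, palladium and platinum: http://www.lbma.org.uk/precious-metal-prices. The relevant part of the site's HTML is as follows.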

      <div id="header-tabs-content" data-tabs-content="header-tabs">
        <div class="tabs-panel is-active" id="header-tabs-panel1" role="tabpanel" aria-hidden="false" aria-labelledby="header-tabs-panel1-label">
          <a href="/precious-metal-prices">
          <p>Gold Price</p>
          <p>AM: 
              <strong>$
              <span id="daily_gold_am_usd">1325.40</span>
              </strong> <br>
            <em class="update">Updated: <span 
          id="daily_gold_am_timestamp">08/03 10:31:00</span></em> </p>
          <p>PM: 
              <strong>$
              <span id="daily_gold_pm_usd">1321.00</span>
              </strong> <br>
            <em class="update">Updated: <span 
          id="daily_gold_pm_timestamp">08/03 15:02:00</span></em> </p>
            </a>

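I am interested in getting the values of daily_gold_am_usd (1325.40) and daily_gold_pm_usd (1321.00) from the HTML structure above. However, the code I put together after consulting past posts does not seem to return these values.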

#Import packages

import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup

#define url and get html

url = "http://www.lbma.org.uk/precious-metal-prices"
r=requests.get(url)
data=r.text
soup = BeautifulSoup(data,"html.parser")

#Find the object of interest

gold_am_price = soup.find("span", {"id": "daily_gold_am_usd"})
Au_price_am = gold_am_price.text.strip()

gold_pm_price = soup.find("span", {"id": "daily_gold_pm_usd"})
Au_price_pm = gold_pm_price.text.strip()

Any help is appreciated. Thank you all.

1 Answer:

Answer 0 (score: 1)

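These values are loaded by an XHR to http://lbma.oblive.co.uk/api/today/both.json, that is, they are filled in by JavaScript after the page loads, and requests does not execute JavaScript. You can request that endpoint directly instead: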

import requests
url = "http://lbma.oblive.co.uk/api/today/both.json"
response = requests.get(url).json()
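
The same request can be made a little more defensive. The following is only a minimal sketch, assuming the endpoint still responds with the JSON shown in this answer; `fetch_prices`, `DEFAULT_URL` and the `http_get` hook are hypothetical names introduced here, not part of the original answer or of any library.

```python
DEFAULT_URL = "http://lbma.oblive.co.uk/api/today/both.json"


def fetch_prices(url=DEFAULT_URL, timeout=10, http_get=None):
    """Return the decoded JSON payload from the price endpoint.

    http_get is a hypothetical hook so the function can be exercised
    without network access; it defaults to requests.get.
    """
    if http_get is None:
        import requests  # imported lazily so a stub can stand in for it
        http_get = requests.get
    reply = http_get(url, timeout=timeout)  # give up instead of hanging forever
    reply.raise_for_status()  # raises on 4xx/5xx instead of parsing an error page
    return reply.json()
```

Calling `response = fetch_prices()` should then give the same dictionary as the one-liner above, with a timeout and an explicit HTTP status check added.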

Output of print(response):

{'gold': {'am': {'usd': '1325.40', 'gbp': '955.080', 'eur': '1070.390', 'timestamp': '08/03 10:31:00'},
          'pm': {'usd': '1321.00', 'gbp': '953.370', 'eur': '1069.750', 'timestamp': '08/03 15:02:00'}},
 'silver': {'usd': '16.48000', 'usdc': '1648', 'gbp': '11.89000', 'gbpp': '1189', 'eur': '13.31000', 'eurc': '1331', 'timestamp': '08/03 12:01:00'},
 'platinum': {'am': {'usd': '949.00', 'gbp': '683.960', 'eur': '766.250', 'timestamp': '08/03 09:49:00'},
              'pm': {'usd': '954.00', 'gbp': '687.570', 'eur': '769.670', 'timestamp': '08/03 14:09:00'}},
 'palladium': {'am': {'usd': '970.00', 'gbp': '699.100', 'eur': '783.210', 'timestamp': '08/03 09:49:00'},
               'pm': {'usd': '985.00', 'gbp': '709.910', 'eur': '794.680', 'timestamp': '08/03 14:09:00'}}}
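
Note that in this output silver holds its prices directly, while gold, platinum and palladium nest them under 'am' and 'pm'. If you want all four metals in one pass, something along these lines might work. It is only a sketch: `flatten_prices` is a hypothetical helper, and `sample` is a trimmed copy of the output above (USD fields only).

```python
def flatten_prices(payload, currency="usd"):
    """Return (metal, session, price_string) rows from the nested payload.

    Metals with 'am'/'pm' sub-dicts produce one row per session; a metal
    whose dict holds the currency directly (silver in the output above)
    produces a single row with session None.
    """
    rows = []
    for metal, quote in payload.items():
        if currency in quote:
            rows.append((metal, None, quote[currency]))
        else:
            for session in ("am", "pm"):
                if session in quote:
                    rows.append((metal, session, quote[session][currency]))
    return rows


# Trimmed copy of the output shown above (USD fields only).
sample = {
    "gold": {"am": {"usd": "1325.40"}, "pm": {"usd": "1321.00"}},
    "silver": {"usd": "16.48000"},
}

for row in flatten_prices(sample):
    print(row)
```

The resulting list of tuples can be handed to pandas.DataFrame if you prefer a table, since the question already imports pandas.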

Then you can extract the values you need:

response['gold']['am']['usd']  # '1325.40' (a string)
response['gold']['pm']['usd']  # '1321.00' (a string)
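
The prices in the JSON are strings, so you will probably want to convert them before doing arithmetic. A small sketch, assuming the payload has the shape shown in the output above; `gold_usd` is a hypothetical helper, and Decimal is used here only as one reasonable choice for prices (float would also work).

```python
from decimal import Decimal


def gold_usd(payload):
    """Return the gold AM and PM USD prices as Decimal values."""
    gold = payload["gold"]
    return {session: Decimal(gold[session]["usd"]) for session in ("am", "pm")}


# Gold entry copied from the output shown above.
sample = {
    "gold": {
        "am": {"usd": "1325.40", "gbp": "955.080", "eur": "1070.390",
               "timestamp": "08/03 10:31:00"},
        "pm": {"usd": "1321.00", "gbp": "953.370", "eur": "1069.750",
               "timestamp": "08/03 15:02:00"},
    }
}

prices = gold_usd(sample)
print(prices["am"] - prices["pm"])  # difference between AM and PM: 4.40
```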