I am trying to fetch the last traded price (LTP) of different commodities from the MCX website https://www.mcxindia.com/market-data/market-watch in Python 2.0. Here is the code I am using.
import requests
from bs4 import BeautifulSoup
url = 'https://www.mcxindia.com/market-data/market-watch'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
soup.findAll('div',attrs={'class':'ltp green ltpcenter'})
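But when I run the code, I get an empty result. I suspect the site queries another web server for these values, because when I view the page source I cannot see the last traded prices there. Can anyone help me get the price data into Python?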
Answer 0 (score: 1)
The code below fetches all the market data shown on that page. Extract whatever you need from the JSON response.
import requests
url = "https://www.mcxindia.com/backpage.aspx/GetMarketWatch"
headers = {
"Host": "www.mcxindia.com",
"Origin": "https://www.mcxindia.com",
"X-Requested-With": "XMLHttpRequest",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 Safari/537.36",
"Content-Type": "application/json",
"Referer": "https://www.mcxindia.com/market-data/market-watch",
"Accept": "application/json, text/javascript, */*; q=0.01",
}
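# POST with browser-like headers; the endpoint replies with JSON
resp = requests.post(url, headers=headers)
resp.raise_for_status()  # stop early on a non-2xx response
market_data = resp.json()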
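The answer does not show the shape of the JSON, so here is a minimal sketch of one way to pull a field out of `market_data` without knowing its nesting in advance. The key name `"LTP"`, the helper `find_values`, and the sample payload are assumptions made for illustration and are not taken from the actual MCX response. Print `market_data` first and adjust the key to what you actually see.

```python
def find_values(node, key):
    """Yield every value stored under `key` anywhere in a nested JSON structure."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            else:
                # Look inside nested dicts and lists
                for found in find_values(value, key):
                    yield found
    elif isinstance(node, list):
        for item in node:
            for found in find_values(item, key):
                yield found


# Hypothetical payload standing in for resp.json(); the real layout may differ.
sample = {"d": {"Data": [{"Symbol": "GOLD", "LTP": 31250.0},
                         {"Symbol": "SILVER", "LTP": 38110.0}]}}

ltp_values = list(find_values(sample, "LTP"))
print(ltp_values)
```

With the real response you would call `list(find_values(market_data, "LTP"))`, replacing `"LTP"` with whatever key the response actually uses.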
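Answer 1 (score: 0)
You have to deal with the JavaScript. You can use Selenium to load the page so that the JavaScript runs; see the code below.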
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait as wait
from bs4 import BeautifulSoup
driver = webdriver.Chrome()
driver.get("https://www.mcxindia.com/market-data/market-watch")
wait(driver, 10).until(EC.visibility_of_element_located(
(By.XPATH, '//*[@class="symbol chnge-perc right5"]')))
source = driver.page_source
soup = BeautifulSoup(source, 'html.parser')
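# Keep the matches; this only finds divs whose class attribute is exactly this string
ltp_divs = soup.findAll('div', attrs={'class': 'ltp green ltpcenter'})
print(ltp_divs)
driver.quit()  # close the browser once the page source has been parsed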