Python BeautifulSoup find with data-reactid

Time: 2018-11-18 20:55:52

Tags: python python-3.x beautifulsoup

I am trying to scrape some stock prices from Yahoo Finance, but instead of the expected result, 229.30, I get "Mail". Does anyone know why?

from bs4 import BeautifulSoup
import requests
import sys
from datetime import datetime, timedelta
import pandas as pd

code = input("Enter the NYSE stock symbol: ")

#Your Choice Stock
source = requests.get('https://finance.yahoo.com/quote/' + code + '/history?p=' + code).text
soup = BeautifulSoup(source, 'lxml')
price = soup.find('span', attrs={"data-reactid": "55"})
print(code + " stock: " + price.text)

Also, ignore all the other imports; they are part of my larger file.

Edit: It now at least gives me a number, but not the number I am looking for. It gives me 231.12 instead of 229.30. Also, the stock I am fetching is Costco. (COST is the NYSE stock symbol.)

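Another edit: For some reason it is reading a different data-reactid, not 55. When I tried 53, it gave me the value at 55. Why does it appear to be 2 data-reactids ahead?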

2 Answers:

Answer 0: (score: 1)

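I tried visiting the URL shown in your code and was redirected to https://finance.yahoo.com/lookup?s=COSTCO. I inspected the elements inside the table and found that your tag was wrong. Change span to td and everything works.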

from bs4 import BeautifulSoup
import requests
import sys
from datetime import datetime, timedelta
import pandas as pd

code = "Costco" #input("Enter the NYSE stock symbol: ")

#Your Choice Stock
url = "https://finance.yahoo.com/lookup?s={}".format(code)
print(url)
source = requests.get(url).text
soup = BeautifulSoup(source, 'lxml')
symbol = soup.find('td', attrs={"data-reactid": "57"}) # 57+8*n
name = soup.find('td', attrs={"data-reactid": "58"}) # 58+8*n
price = soup.find('td', attrs={"data-reactid": "59"}) # 59+8*n
print(code + " stock: " + price.text)

print(pd.read_html(url))

Output:

https://finance.yahoo.com/lookup?s=Costco
Costco stock: 231.02
[       Symbol                             Name   ...       Type Exchange
0        COST     Costco Wholesale Corporation   ...     Stocks      NMS
1     COST.MX            COSTCO WHOLESALE CORP   ...     Stocks      MEX
2      CTO.DU         COSTCO WHOLESALE DL-,005   ...     Stocks      DUS
3      CTO.SG  COSTCO WHOLESALE CORP. Register   ...     Stocks      STU
4      CTO.MU         COSTCO WHOLESALE DL-,005   ...     Stocks      MUN
5     COST.VI            COSTCO WHOLESALE CORP   ...     Stocks      VIE
6      CTO.HM         COSTCO WHOLESALE DL-,005   ...     Stocks      HAM
7      CTO.BE         COSTCO WHOLESALE DL-,005   ...     Stocks      BER
8       CTO.F         COSTCO WHOLESALE DL-,005   ...     Stocks      FRA
9      0I47.L  COSTCO WHOLESALE CORP COSTCO WH   ...     Stocks      LSE
10  COWC34.SA                 COSTCO WHOLESALE   ...     Stocks      SAO

[11 rows x 6 columns]]
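The `57+8*n` comments above rely on the data-reactid numbering staying fixed. As a hedged alternative, the rows can be read by position in the table instead. This is only a sketch: `SAMPLE_HTML` is a hand-written stand-in for the lookup page, `parse_lookup_rows` is a hypothetical helper, and the assumption that Symbol, Name, and price are the first three columns is taken from the reactid pattern above, not verified against the live page.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the lookup page; the real markup may differ.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Symbol</th><th>Name</th><th>Last Price</th></tr></thead>
  <tbody>
    <tr><td>COST</td><td>Costco Wholesale Corporation</td><td>231.02</td></tr>
    <tr><td>COST.MX</td><td>COSTCO WHOLESALE CORP</td><td>4,650.00</td></tr>
  </tbody>
</table>
"""


def parse_lookup_rows(html):
    """Return one dict per table row, reading cells by column position."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table tbody tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Assumed column order: symbol, name, price (hypothetical).
        if len(cells) >= 3:
            rows.append({"symbol": cells[0], "name": cells[1], "price": cells[2]})
    return rows


rows = parse_lookup_rows(SAMPLE_HTML)
print(rows[0]["symbol"] + " stock: " + rows[0]["price"])
```

With the real page, `SAMPLE_HTML` would be replaced by `requests.get(url).text`, and the selector may need adjusting to match the actual table.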

Answer 1: (score: 1)

data-reactid is dynamic. For the latest price, it is easier to get the data from the fifth column, Close*, of the historical-data table:
price = soup.select('table td')
print(code + " stock: " + price[4].text)

If you look at the page source, there is some interesting JSON-formatted data:

root.App.main = {..}

After parsing it, select the price:

price = jsonData["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]["price"]["regularMarketPrice"]["raw"]
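The parsing step is not shown above, so here is one possible sketch using only the standard library. `SAMPLE_SOURCE` is a hand-written stand-in for the page source, `extract_app_main` is a hypothetical helper, and the regular expression assumes the `root.App.main = {...};` assignment sits on a single line. The real page may be laid out differently.

```python
import json
import re

# Hand-written stand-in for the page source; only the key path used above is included.
SAMPLE_SOURCE = """
<script>
(function (root) {
root.App.main = {"context": {"dispatcher": {"stores": {"QuoteSummaryStore": {"price": {"regularMarketPrice": {"raw": 229.3, "fmt": "229.30"}}}}}}};
}(this));
</script>
"""


def extract_app_main(page_source):
    """Return the object assigned to root.App.main, parsed as JSON."""
    # Capture the {...} between "root.App.main =" and the ";" that ends that line.
    match = re.search(r"root\.App\.main\s*=\s*(\{.*\})\s*;\s*$", page_source, re.MULTILINE)
    if match is None:
        raise ValueError("root.App.main assignment not found")
    return json.loads(match.group(1))


jsonData = extract_app_main(SAMPLE_SOURCE)
price = jsonData["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]["price"]["regularMarketPrice"]["raw"]
print(price)
```

Raising an error when the assignment is missing makes a layout change on the page fail loudly instead of silently returning a wrong value.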