Downloading data from Yahoo! Finance

Time: 2018-04-14 16:27:59

Tags: python pandas beautifulsoup yahoo-finance

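Hello, I recently started a project on Bitcoin analysis and need to download financial data from Yahoo! Finance with Python. I tried fix_yahoo_finance and pandas datareader, but there seems to be a bug on the website when downloading the file: a few days are always missing. So I decided to use Beautiful Soup. The code is as follows: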

import requests
import time
import pandas as pd
from bs4 import BeautifulSoup

def time_convert(dt):
    # Convert a 'YYYY-MM-DD HH:MM:SS' string (interpreted as local time)
    # into a Unix timestamp string, as expected by period1/period2.
    s = time.mktime(time.strptime(dt, '%Y-%m-%d %H:%M:%S'))
    return str(int(s))

s = requests.Session()
start = time_convert("2016-02-15 00:00:00")
end   = time_convert("2018-02-15 00:00:00")

r = s.get("https://uk.finance.yahoo.com/quote/BTC-USD/history?period1="+start+"&period2="+end+"&interval=1d&filter=history&frequency=1d")

soup = BeautifulSoup(r.text, 'lxml')
tables = soup.select('table')

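# Parse every <table> on the page, then combine and save once after the loop.
df_list = []
for table in tables:
    df_list.append(pd.concat(pd.read_html(table.prettify())))
df = pd.concat(df_list)
# Raw string so the backslashes in the Windows path are not treated as escapes.
df.to_excel(r"E:\PythonData\price_.xlsx")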

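It works, but the data is incomplete: the website loads more rows as you scroll to the bottom of the page, and the code does not do that. How can I solve this?

2 answers:

Answer 0: (score: 2)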

Yahoo used to have a finance API, but they have discontinued it. There is a workaround, though.

I have used this successfully before; you may want to take a look at it.
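
Separately from the resource referred to above, one possible way around the scrolling problem in the question is to request the history page in several shorter date windows, so that each response needs no scrolling. The sketch below only builds the URLs. It assumes that a window of about 90 daily rows fits in the initially rendered table (this limit is an assumption, not something confirmed in this thread), and the `history_urls` helper and its `window_days` parameter are hypothetical names. It also uses UTC timestamps rather than the local-time `time.mktime` from the question.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Base URL taken from the question.
BASE_URL = "https://uk.finance.yahoo.com/quote/BTC-USD/history"


def history_urls(start, end, window_days=90):
    """Yield one history-page URL per window of at most `window_days` days.

    `start` and `end` are timezone-aware datetimes. The 90-day default is an
    assumed safe size for a single page load, not a documented limit.
    """
    step = timedelta(days=window_days)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        params = {
            "period1": int(cursor.timestamp()),
            "period2": int(window_end.timestamp()),
            "interval": "1d",
            "filter": "history",
            "frequency": "1d",
        }
        yield f"{BASE_URL}?{urlencode(params)}"
        # The next window starts where this one ended, so boundary days
        # can appear twice and should be de-duplicated after parsing.
        cursor = window_end


start = datetime(2016, 2, 15, tzinfo=timezone.utc)
end = datetime(2018, 2, 15, tzinfo=timezone.utc)
urls = list(history_urls(start, end))
print(len(urls), "requests needed")
```

Each URL can then be fetched and parsed with the same `requests` and `pd.read_html` code as in the question, and the resulting frames combined with `pd.concat` followed by `drop_duplicates()`.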

Answer 1: (score: 0)

Have you tried YahooFinancials? It is well built and does not scrape the rendered web page. Instead, it extracts the data you need from the ["context"]["dispatcher"]["stores"] object. It is also very fast.
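
To illustrate the idea of reading a ["context"]["dispatcher"]["stores"] object, here is a minimal sketch. It is not the library's actual code. It assumes the page embeds its state as a JavaScript assignment of the form `root.App.main = {...};`, and the `extract_stores` helper, the sample HTML, and the `HistoricalPriceStore` key are illustrative assumptions.

```python
import json
import re

# Assumption: the state is assigned to `root.App.main` inside a <script> tag.
_MARKER = re.compile(r"root\.App\.main\s*=\s*")


def extract_stores(html):
    """Return the ["context"]["dispatcher"]["stores"] dict embedded in `html`."""
    match = _MARKER.search(html)
    if match is None:
        raise ValueError("embedded application state not found")
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (here the trailing ';' and markup).
    state, _ = json.JSONDecoder().raw_decode(html, match.end())
    return state["context"]["dispatcher"]["stores"]


# Tiny hand-written page fragment standing in for a real response.
sample_html = """<script>
root.App.main = {"context": {"dispatcher": {"stores": {"HistoricalPriceStore": {"prices": [{"date": 1506830400, "close": 54.92}]}}}}};
</script>"""

stores = extract_stores(sample_html)
print(stores["HistoricalPriceStore"]["prices"][0]["close"])
```

Because the data arrives as JSON in the initial response, this approach does not depend on rows that only appear after scrolling.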

$ pip install yahoofinancials

Usage example:

from yahoofinancials import YahooFinancials

tech_stocks = ['AAPL', 'MSFT', 'INTC']
bank_stocks = ['WFC', 'BAC', 'C']

yahoo_financials_tech = YahooFinancials(tech_stocks)
yahoo_financials_banks = YahooFinancials(bank_stocks)

tech_cash_flow_data_an = yahoo_financials_tech.get_financial_stmts('annual', 'cash')
bank_cash_flow_data_an = yahoo_financials_banks.get_financial_stmts('annual', 'cash')

banks_net_ebit = yahoo_financials_banks.get_ebit()
tech_stock_price_data = yahoo_financials_tech.get_stock_price_data()
daily_bank_stock_prices = yahoo_financials_banks.get_historical_stock_data('2008-09-15', '2017-09-15', 'daily')

Example output:

yahoo_financials = YahooFinancials('WFC')
print(yahoo_financials.get_historical_stock_data("2017-09-10", "2017-10-10", "monthly"))

returns

{
    "WFC": {
        "prices": [
            {
                "volume": 260271600,
                "formatted_date": "2017-09-30",
                "high": 55.77000045776367,
                "adjclose": 54.91999816894531,
                "low": 52.84000015258789,
                "date": 1506830400,
                "close": 54.91999816894531,
                "open": 55.15999984741211
            }
        ],
        "eventsData": [],
        "firstTradeDate": {
            "date": 76233600,
            "formatted_date": "1972-06-01"
        },
        "isPending": false,
        "timeZone": {
            "gmtOffset": -14400
        },
        "id": "1mo15050196001507611600"
    }
}
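
Since the question saves the data to a spreadsheet, the nested result above usually needs flattening into one row per date. The following is a small sketch that works on a dict with the shape shown in the example output. The `prices_to_rows` helper and its `fields` parameter are hypothetical names, and the sample dict is trimmed from the output above.

```python
def prices_to_rows(result, ticker,
                   fields=("formatted_date", "open", "high", "low",
                           "close", "adjclose", "volume")):
    """Flatten result[ticker]["prices"] into a list of dicts, oldest first."""
    prices = sorted(result[ticker]["prices"], key=lambda p: p["date"])
    # Keep only the requested fields; missing keys become None.
    return [{name: price.get(name) for name in fields} for price in prices]


# Sample trimmed from the example output above.
sample = {
    "WFC": {
        "prices": [
            {
                "volume": 260271600,
                "formatted_date": "2017-09-30",
                "high": 55.77000045776367,
                "adjclose": 54.91999816894531,
                "low": 52.84000015258789,
                "date": 1506830400,
                "close": 54.91999816894531,
                "open": 55.15999984741211,
            }
        ]
    }
}

rows = prices_to_rows(sample, "WFC")
print(rows[0]["formatted_date"], rows[0]["close"])
```

The resulting list can be passed directly to `pd.DataFrame(rows)` and written out with `to_excel`, as in the question.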