Unable to scrape data from Yahoo Finance

Time: 2021-04-11 00:51:30

Tags: python pandas request

I am scraping data from Yahoo Finance. I found the following download link:

https://query1.finance.yahoo.com/v7/finance/download/BVH?period1=923729900&period2=1618039708&interval=1d&events=history&includeAdjustedClose=true

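import io

import pandas as pd
import requests


def createLink(symbol, table):
    # Template URL: swap in the ticker symbol and the event type ("history", "div", "split").
    s = "https://query1.finance.yahoo.com/v7/finance/download/BVH?period1=923729900&period2=1618039708&interval=1d&events=history&includeAdjustedClose=true"
    return s.replace("BVH", symbol).replace("history", table)


def getData(symbol, table):
    URL = createLink(symbol, table)
    web = requests.get(URL)
    if web.status_code == 200:
        # Parse the body that was already downloaded instead of requesting the URL a second time.
        reader = pd.read_csv(io.StringIO(web.text))
    else:
        # Empty fallback; the key column must be "Date" so the merges in history() can find it.
        reader = pd.DataFrame({"Date": [], "Dividends": [], "Stock Splits": []})
    return reader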


def history(symbol):
    # Daily prices; stop early if the download failed.
    history_close = getData(symbol, 'history')
    if history_close.empty:
        return history_close
    divend = getData(symbol, 'div')
    stock = getData(symbol, 'split')
    # Combine dividends and splits, then attach both to the price history by date.
    x = pd.merge(divend, stock, how="outer", on="Date")
    data = pd.merge(history_close, x, how="outer", on="Date")
    return data

df = pd.read_excel("/content/drive/MyDrive/Colab Notebooks/symbolNYSE.xlsx")
count = 0
count_fail = 0

for symbol in df["Symbol"]:
    try:
        print(symbol, count + count_fail + 1)
        a = history(symbol)
        if a.empty:
            # Non-200 response: nothing to save, so count it as a failure.
            count_fail += 1
        else:
            a.to_excel("/content/drive/MyDrive/ColabNotebooks/GetCloseYahoo/" + symbol + ".xlsx")
            count += 1
    except Exception as err:
        # Report the failing symbol instead of silently swallowing the error.
        count_fail += 1
        print("failed:", symbol, repr(err))

print("success:", count)
print("fail:", count_fail)
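
The `createLink` function above edits a hard-coded URL with string replacement. As a minimal sketch of an alternative, the query string can be built from named parameters. This assumes `period1` and `period2` are Unix timestamps in seconds, which the values in the link appear to be; `build_download_url` and `BASE` are hypothetical names introduced only for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Base path copied from the link in the question.
BASE = "https://query1.finance.yahoo.com/v7/finance/download/"


def build_download_url(symbol, start, end, events="history"):
    """Build the download URL from a symbol, a date range, and an event type."""
    params = {
        # Assumption: the endpoint expects whole Unix seconds here.
        "period1": int(start.timestamp()),
        "period2": int(end.timestamp()),
        "interval": "1d",
        "events": events,
        "includeAdjustedClose": "true",
    }
    return BASE + symbol + "?" + urlencode(params)


# Reproduce the link from the question using its own timestamps.
start = datetime.fromtimestamp(923729900, tz=timezone.utc)
end = datetime.fromtimestamp(1618039708, tz=timezone.utc)
url = build_download_url("BVH", start, end)
print(url)
```

Passing `events="div"` or `events="split"` would give the other two links that `getData` requests.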

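I am using Python, requests, and pandas in Jupyter to scrape it. Errors:

  • Error tokenizing data. C error: Expected 2 fields in line 3, saw 12
  • KeyError

At first I can scrape about 100 to 200 companies. After that, the program fails on every symbol. If I wait a while and run it again, it works without errors.

What is the cause? Thank you very much.

0 answers:

No answers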
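
The pattern described (a burst of successes, then failures, then success again after waiting) would be consistent with the server temporarily rejecting requests, although nothing in the question confirms this. Under that assumption, a minimal sketch of retrying with an increasing delay could look like the following. `retry_with_backoff` and `flaky_download` are hypothetical names, and the demonstration uses a stand-in function rather than a real download.

```python
import time


def retry_with_backoff(func, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func() until it returns, waiting longer after each failure.

    The wait doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
    The final failure is re-raised so the caller still sees the real error.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demonstration with a stand-in for a flaky download (no network involved).
calls = {"n": 0}


def flaky_download():
    # Fails on the first two calls, then returns a small CSV-like string.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return "Date,Close\n2021-04-09,1.0\n"


waits = []  # record the delays instead of actually sleeping
result = retry_with_backoff(flaky_download, sleep=waits.append)
print(calls["n"], waits)
```

To use this with the script in the question, the wrapped function has to raise an exception on a bad response. A `getData` that returns an empty frame for a non-200 status would not trigger a retry.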