Web scraping Seeking Alpha follower data

Asked: 2019-07-08 14:52:15

Tags: python web-scraping

I'm working on a script, using Python in an Anaconda virtual environment, that web-scrapes Seeking Alpha follower data for specific stock tickers. I've asked about this before, and the answer I received helped to an extent. For some reason, when I change the code, by renaming a ticker or adding more tickers to the list, it sometimes fails immediately, while at other times it works fine. I'd like to know if anyone has suggestions/edits for my code, or another way to get this data. My code and output are attached.

Code:

import requests
tickers = [ "atvi", "goog", "aapl", "amzn", "brk.b", "brk.a", "nflx", "snap"]

with requests.Session() as s:
    for ticker in tickers:
        r = s.get('https://seekingalpha.com/memcached2/get_subscribe_data/{}?id={}'.format(ticker, ticker)).json()
        print(ticker, r['portfolio_count'])

Output: Here is the output error I receive

Other times it works, this is how it should look

1 answer:

Answer 0 (score: 0)

If you pass a wrong ticker symbol, or the server has trouble handling the request, it returns an empty response and sets the status code to something other than 200 (e.g. 403). You need to check for that:

import requests
tickers = [ "xxx", "atvi", "goog", "aapl", "amzn", "brk.b", "brk.a", "nflx", "snap"]

with requests.Session() as s:
    for ticker in tickers:
        response = s.get('https://seekingalpha.com/memcached2/get_subscribe_data/{}?id={}'.format(ticker, ticker))
        if response.status_code != 200:
            print(ticker, 'ERROR!')
            continue
        r = response.json()
        print(ticker, r['portfolio_count'])

Prints:

xxx ERROR!
atvi 84,194
goog 1,038,749
aapl 2,076,496
amzn 817,339
brk.b 198,362
brk.a 74,682
nflx 368,925
snap 95,903
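Since the failures are intermittent, they may be transient server-side errors rather than bad tickers. In that case, mounting a retry adapter on the session so 5xx responses are retried automatically can help. A minimal sketch using urllib3's `Retry` (the retry count, backoff factor, and status list here are illustrative choices, not values from the original post):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry transient server errors with exponential backoff.
# total=3 attempts, backoff_factor=1 -> waits of roughly 0s, 2s, 4s.
retry = Retry(total=3, backoff_factor=1, status_forcelist=[500, 502, 503])

session = requests.Session()
# Mount the adapter for all https:// URLs, including seekingalpha.com.
session.mount('https://', HTTPAdapter(max_retries=retry))
```

Requests made through `session` will then transparently retry on the listed status codes before you ever see the response, so the status-code check in the answer above only fires on persistent errors.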