Getting a 401 error when scraping a website

Time: 2021-07-05 08:04:24

Tags: python web-scraping python-requests

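I am trying to scrape https://www.nseindia.com/api/equity-stockIndices?index=NIFTY%20500 with Python's requests library, but the request fails. I am using the URL and headers shown in the browser's Network tab, yet I get a [401] response. Can someone point out my mistake? Thanks!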

In [13]: url = "https://www.nseindia.com/api/equity-stockIndices?index=NIFTY%20500"

In [14]: headers = {
    ...: "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    ...: "accept-encoding": "gzip, deflate, br",
    ...: "accept-language": "en-US,en;q=0.9,hi;q=0.8",
    ...: "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36 Edg/91.0.864.64"
    ...: }

In [15]: requests.get(url, headers=headers)
Out[15]: <Response [401]>

1 answer:

Answer 0 (score: 0)

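To load data from this site, first request the home page in a session so that its cookies are stored, then call the API with the same session: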

import json
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0",
}

url = "https://www.nseindia.com/api/equity-stockIndices?index=NIFTY+50"

with requests.Session() as s:
    # Visit the home page first so the session picks up its cookies:
    s.get("https://www.nseindia.com", headers=headers)

    # The API request reuses those cookies:
    data = s.get(url, headers=headers).json()
    print(json.dumps(data, indent=4))
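The same two-step pattern (visit the home page so the session stores its cookies, then call the API with that session) can be wrapped in a small helper that raises on an error status such as 401 instead of trying to parse the body. This is only a sketch: `fetch_index`, its parameters, and the `timeout` value are hypothetical, and the function accepts any object with a requests-style `get` method, so the caller is expected to pass in a `requests.Session()`.

```python
BASE_URL = "https://www.nseindia.com"
API_URL = BASE_URL + "/api/equity-stockIndices"

HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0",
}


def fetch_index(session, index_name, timeout=10):
    """Return the parsed JSON for one index (hypothetical helper).

    `session` is assumed to behave like requests.Session.
    """
    # First request: lets the session store whatever cookies the home page sets.
    session.get(BASE_URL, headers=HEADERS, timeout=timeout)

    # Second request: same session, so those cookies are sent along.
    response = session.get(
        API_URL,
        params={"index": index_name},
        headers=HEADERS,
        timeout=timeout,
    )
    # Raise on a 4xx/5xx status (for example a 401) rather than parsing the body.
    response.raise_for_status()
    return response.json()
```

With requests installed, a call might look like `with requests.Session() as s: data = fetch_index(s, "NIFTY 50")`. Passing the index name through `params` leaves the URL encoding to the library.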

The script prints:

{
    "name": "NIFTY 50",
    "advance": {
        "declines": "12",
        "advances": "38",
        "unchanged": "0"
    },
    "timestamp": "05-Jul-2021 13:43:47",
    "data": [
        {
            "priority": 1,
            "symbol": "NIFTY 50",
            "identifier": "NIFTY 50",
            "open": 15793.4,
            "dayHigh": 15833.2,
            "dayLow": 15762.05,
            "lastPrice": 15825.1,
            "previousClose": 15722.2,
            "change": 102.89999999999964,
            "pChange": 0.65,
            "ffmc": 681557737.89,
            "yearHigh": 15915.65,
            "yearLow": 10562.9,
            "totalTradedVolume": 133736133,
            "totalTradedValue": 98836934356.07,
            "lastUpdateTime": "05-Jul-2021 13:43:47",
            "nearWKH": 0.568936864030054,
            "nearWKL": -49.817758380747726,
            "perChange365d": 48.22,
            "date365dAgo": "03-Jul-2020",
            "chart365dPath": "https://static.nseindia.com/sparklines/365d/NIFTY-50.jpg",
            "date30dAgo": "04-Jun-2021",
            "perChange30d": 0.33,
            "chart30dPath": "https://static.nseindia.com/sparklines/30d/NIFTY-50.jpg",
            "chartTodayPath": "https://static.nseindia.com/sparklines/today/NIFTY-50.jpg"
        },
        {
            "priority": 0,
            "symbol": "HINDALCO",

...
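Once the JSON is parsed, the `data` list can be reduced to the fields of interest. The sketch below assumes the response keeps the shape shown in the output above (a `data` list of dicts with `symbol`, `lastPrice`, and `pChange` keys). `summarize` is a hypothetical helper, and the sample dict holds only the first entry from that output.

```python
def summarize(payload):
    """Return (symbol, lastPrice, pChange) tuples from an index payload."""
    rows = []
    for item in payload["data"]:
        rows.append((item["symbol"], item["lastPrice"], item["pChange"]))
    return rows


# Trimmed sample copied from the first entry of the output above.
sample = {
    "name": "NIFTY 50",
    "data": [
        {"symbol": "NIFTY 50", "lastPrice": 15825.1, "pChange": 0.65},
    ],
}

for symbol, last_price, pct_change in summarize(sample):
    print(f"{symbol}: {last_price} ({pct_change:+.2f}%)")
# prints: NIFTY 50: 15825.1 (+0.65%)
```

In the real script, `summarize(data)` would take the `data` variable returned by `.json()` in place of `sample`.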