Question

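I am trying to write a Python script that navigates to this page, fills in the form data, submits the form, and downloads the ZIP file that is returned automatically.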

The file I need is generated by selecting Bhavcopy in the form and supplying a valid past date. [A sample input and its result are shown here.]

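So far I have tried several approaches using Scrapy, webbrowser, requests, BeautifulSoup, Mechanize, and Selenium, learning each by copying and modifying the sample code in its documentation, but I have not made any progress.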

The only code that runs without errors so far is:

import requests
from bs4 import BeautifulSoup

u = 'https://www.nseindia.com/products/content/all_daily_reports.htm'
p = requests.get(u)
soup = BeautifulSoup(p.text, 'html.parser')
for link in soup.find_all('a'):
    d = requests.get(link.get('href'))
    print(d)

This code is incomplete, and I don't know how to proceed. Logically, I know I should:

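Fetch the page [done]
Select the form element [not done; probably doable in Scrapy]
Fill in the data (the first parameter is constant, and the dates can be supplied in a loop) [I don't know how to do this programmatically]
Submit the form [same as above]
Follow the a tag to download the file named in its href attribute [requests.get on the href value should do it]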

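Any pointers on how to achieve this would be greatly appreciated.

The page lets you download a daily report called Bhavcopy, which contains the open, low, high, and close prices of every stock traded on the National Stock Exchange (India). I want to accumulate as much historical data as possible.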

Answer 1

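If your goal is to download the ZIP file for a given date, this will do the job.

If you inspect the ZIP file element, you can see that its href is /content/historical/EQUITIES/2018/FEB/cm02FEB2018bhav.csv.zip or /content/historical/EQUITIES/2017/DEC/cm06DEC2017bhav.csv.zip, depending on the date you entered.

If you look closely at these links, you can see that the URL follows a very simple pattern. So your program no longer has to navigate to the page, submit the data, and fetch the ZIP file; it only has to format the URL.

With the URL https://www.nseindia.com/content/historical/EQUITIES/2017/DEC/cm06DEC2017bhav.csv.zip, you can download the ZIP file directly.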

import webbrowser

def download_zip(date, month, year):
    url = 'https://www.nseindia.com/content/historical/EQUITIES/{2}/{1}/cm{0}{1}{2}bhav.csv.zip'.format(date, month, year)
    webbrowser.open(url)

download_zip('02', 'FEB', '2018')

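Calling the download_zip function opens the URL in your default browser, which downloads the ZIP file directly.

The only thing you need to watch is the format of the arguments: the day is DD, the month is MMM, and the year is YYYY. If the URL is invalid (for example, there is no report for that date), the download will fail, so watch for that and use try / except to handle errors.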
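Because the three arguments must match the DD / MMM / YYYY pattern exactly, one way to avoid typos is to derive them from a datetime.date. The helper below is only a minimal sketch: date_parts and bhavcopy_url are hypothetical names that are not part of the code above, and the month abbreviations are hard-coded so the result does not depend on the system locale.

```python
from datetime import date

# Upper-case English month abbreviations, as they appear in the example URLs.
MONTHS = ['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN',
          'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC']

# Same pattern as in download_zip: {0} = DD, {1} = MMM, {2} = YYYY.
URL_TEMPLATE = ('https://www.nseindia.com/content/historical/EQUITIES/'
                '{2}/{1}/cm{0}{1}{2}bhav.csv.zip')


def date_parts(d):
    """Return the (DD, MMM, YYYY) strings for a datetime.date."""
    return '{:02d}'.format(d.day), MONTHS[d.month - 1], '{:04d}'.format(d.year)


def bhavcopy_url(d):
    """Build the Bhavcopy URL for the given date from the pattern above."""
    return URL_TEMPLATE.format(*date_parts(d))


print(date_parts(date(2018, 2, 2)))      # ('02', 'FEB', '2018')
print(bhavcopy_url(date(2017, 12, 6)))
```

The second call prints the same 06 DEC 2017 URL quoted earlier in this answer, and the tuple from date_parts can be passed straight to download_zip, e.g. download_zip(*date_parts(date(2018, 2, 2))).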

If you don't want a browser window to pop up for the download, you can use the requests module to download the file instead.

import requests

def download_zip(date, month, year):
    url = 'https://www.nseindia.com/content/historical/EQUITIES/{2}/{1}/cm{0}{1}{2}bhav.csv.zip'.format(date, month, year)
    r = requests.get(url)
    r.raise_for_status()  # Raise on an HTTP error (e.g. 404) instead of saving an error page as a .zip
    location = 'C:/Users/username/Downloads/'  # Change the location as per your OS and your needs.
    filename = 'cm{}{}{}bhav.csv.zip'.format(date, month, year)  # You can name this file anything you want, but with a .zip extension
    with open(location + filename, 'wb') as f:
        f.write(r.content)
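To collect history over many dates, as the question asks, the download can be wrapped in a loop that skips weekends and tolerates missing files. This is only a sketch under several assumptions. The names weekdays, fetch_with_urllib, and download_range are hypothetical. It uses the standard-library urllib.request instead of requests so that it runs without third-party packages. It also assumes that the server still serves files at the URL pattern above and answers a missing date (for example, a trading holiday) with an HTTP error.

```python
import os
import urllib.request
from datetime import date, timedelta

MONTHS = ['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN',
          'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC']
URL_TEMPLATE = ('https://www.nseindia.com/content/historical/EQUITIES/'
                '{2}/{1}/cm{0}{1}{2}bhav.csv.zip')


def weekdays(start, end):
    """Yield every Monday-to-Friday date from start to end, inclusive."""
    d = start
    while d <= end:
        if d.weekday() < 5:  # 0-4 are Monday-Friday
            yield d
        d += timedelta(days=1)


def fetch_with_urllib(url):
    """Return the response body as bytes; raises an OSError subclass on failure."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def download_range(start, end, folder, fetch=fetch_with_urllib):
    """Save one ZIP per weekday into folder; return the dates that were saved."""
    saved = []
    for d in weekdays(start, end):
        dd, mmm, yyyy = '{:02d}'.format(d.day), MONTHS[d.month - 1], str(d.year)
        url = URL_TEMPLATE.format(dd, mmm, yyyy)
        try:
            data = fetch(url)
        except OSError as exc:
            # urllib's URLError and HTTPError are OSError subclasses, so
            # holidays and other missing dates land here and are skipped.
            print('skipped', d, exc)
            continue
        path = os.path.join(folder, 'cm{}{}{}bhav.csv.zip'.format(dd, mmm, yyyy))
        with open(path, 'wb') as f:
            f.write(data)
        saved.append(d)
    return saved

# Example call (performs real network requests):
# download_range(date(2018, 2, 1), date(2018, 2, 28), '.')
```

The fetch parameter exists so the loop can be tried with a stand-in function before pointing it at the real site. Swapping in a requests-based fetcher would only require a function that returns bytes and raises an OSError subclass on failure.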

How to automatically fill in form data, submit the form, and download the response ZIP file in Python
