Question

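Before I start, I should say that I'm very new to communicating with the web from code in general. With that said, can anyone help me get these parameters,

        'a': stMonth,
        'b': stDate,
        'c': stYear,
        'd': enMonth,
        'e': enDate,
        'f': enYear,
        'submit': 'submit'

to work for the "Set Date Range" box on this page, http://finance.yahoo.com/q/hp?s=gspc&a=00&b=3&c=1951&d=11&e=29&f=2014&g=d&z=66&y=0

, using my Python code? It currently consists of: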

def getHistoricData(symbol, stMonth, stDate, stYear, enMonth, enDate, enYear):  
    url = 'http://finance.yahoo.com/q/hp?s=%s&a=00&b=3&c=1951&d=11&e=29&f=2014&g=d&z=66&y=0' % symbol    
    params = {
        'a': stMonth,
        'b': stDate,
        'c': stYear,
        'd': enMonth,
        'e': enDate,
        'f': enYear,
        'submit': 'submit',
    }  
    response = requests.get(url, params=params)  
    tree = html.document_fromstring(response.content)

symbol = raw_input("Symbol: ")
getHistoricData(symbol, '00', '11', '2010', '00', '13', '2010')

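I think something may be wrong with the names or the values of the parameters, but I can't tell which. Thanks in advance - any and all help is much appreciated! (Including criticism, as long as it's at least somewhat constructive!)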

Answer 1

You don't need the submit parameter, but you do need g. Here d stands for daily:

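import requests
from lxml import html

def getHistoricData(symbol, stMonth, stDate, stYear, enMonth, enDate, enYear):
    url = 'http://finance.yahoo.com/q/hp'
    params = {
        's': symbol,    # ticker symbol
        'a': stMonth,   # start month
        'b': stDate,    # start day
        'c': stYear,    # start year
        'd': enMonth,   # end month
        'e': enDate,    # end day
        'f': enYear,    # end year
        'g': 'd'        # d = daily
    }
    # requests builds the query string from params
    response = requests.get(url, params=params)
    tree = html.document_fromstring(response.content)
    # Print the first cell of every row in the price table (the dates)
    print tree.xpath('.//table[@class="yfnc_datamodoutline1"]//tr/td[1]/text()')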

For example, if you call:

getHistoricData('^GSPC', '02', '3', '1950', '10', '30', '2014')

it prints the following (the dates from the first column):

[
    'Nov 28, 2014', 
    'Nov 26, 2014', 
    'Nov 25, 2014', 
    'Nov 24, 2014',
    ...
]
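
The call above passes '10' as the end month, yet the newest row printed is from November, which suggests the month parameters are zero-based. Here is a minimal sketch of a helper that builds the same dictionary from datetime.date objects under that assumption. The name build_params is hypothetical, and the zero-based month is an inference from the output above, not documented behaviour:

```python
import datetime

def build_params(symbol, start, end):
    """Build the query parameters for a daily date range.

    Assumes months are zero-based (January == 0), which is inferred
    from the example output and not from any documentation.
    """
    return {
        's': symbol,
        'a': '%02d' % (start.month - 1),  # start month, zero-based (assumed)
        'b': str(start.day),              # start day
        'c': str(start.year),             # start year
        'd': '%02d' % (end.month - 1),    # end month, zero-based (assumed)
        'e': str(end.day),                # end day
        'f': str(end.year),               # end year
        'g': 'd',                         # d = daily
    }

params = build_params('^GSPC',
                      datetime.date(1950, 3, 3),
                      datetime.date(2014, 11, 30))
print(sorted(params.items()))
```

With these two dates the dictionary holds the same values as the getHistoricData call above, and it could be passed as params= to requests.get in the same way.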

Answer 2

I don't think you need to use params at all. Just format the URL, like this:

#!/usr/bin/python
# -*- coding: utf-8 -*-

import requests

symbol = raw_input("Symbol: ")
params = (symbol, '00', '11', '2010', '00', '13', '2010')

url = 'http://finance.yahoo.com/q/hp?s=%s&a=%s&b=%s&c=%s&d=%s&e=%s&f=%s&g=d' % params     
response = requests.get(url)
# you will get 200 OK here 
print response
# and page info is in response.text
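
One caveat when formatting the URL by hand: a symbol such as ^GSPC contains a character that is normally percent-encoded in a query string (the resp.url shown in Answer 3 has it as %5E). The HTTP library may fix this up for you, but here is a rough sketch of encoding the value explicitly with the standard library's quote so the result doesn't depend on that. The function name build_url is hypothetical:

```python
try:
    from urllib.parse import quote   # Python 3
except ImportError:
    from urllib import quote         # Python 2

def build_url(symbol, st_month, st_date, st_year, en_month, en_date, en_year):
    """Format the history URL by hand, percent-encoding the symbol."""
    template = ('http://finance.yahoo.com/q/hp'
                '?s=%s&a=%s&b=%s&c=%s&d=%s&e=%s&f=%s&g=d')
    # safe='' makes quote() encode every reserved character in the symbol
    return template % (quote(symbol, safe=''), st_month, st_date, st_year,
                       en_month, en_date, en_year)

url = build_url('^GSPC', '00', '11', '2010', '00', '13', '2010')
print(url)
```

The other values are plain digits here, so only the symbol is encoded. If any of them could contain special characters, they would need the same treatment.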

Answer 3

The <input> elements whose name attributes are equal to:

a, b, c, d, e, f, g (the Daily/Weekly/Monthly radio button)

are located inside a <form> tag, which also contains this hidden form field:

<input type="hidden" name="s" value="^GSPC" data-rapid_p="11">

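A hidden field sends a name/value pair to the server, just like a regular <input> element. You need to include that name/value pair in your request so that the server-side program knows which stock you are requesting data for.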
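
To see which name/value pairs a form would send, you can collect the <input> tags from its markup. Below is a minimal sketch using the standard library's HTMLParser. The class name InputCollector is made up, and the form markup is a simplified, hypothetical stand-in for the real page (only the hidden field is taken from above):

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2

class InputCollector(HTMLParser):
    """Collect the name/value pairs of <input> tags that have a name."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == 'input':
            attrs = dict(attrs)
            # Inputs without a name attribute are not sent to the server
            if 'name' in attrs:
                self.fields[attrs['name']] = attrs.get('value', '')

# Hypothetical, simplified form markup for illustration only
form_html = '''
<form action="/q/hp">
  <input type="hidden" name="s" value="^GSPC" data-rapid_p="11">
  <input type="text" name="c" value="1951">
  <input type="submit" value="Get Prices">
</form>
'''

collector = InputCollector()
collector.feed(form_html)
collector.close()
print(sorted(collector.fields.items()))
```

The hidden field shows up alongside the visible one, which is why s has to be part of the request.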

The submit button in the form also sends a name/value pair to the server, but it is rarely useful, and in this case you can omit it:

import requests

url = 'http://finance.yahoo.com/q/hp'

params = {
    's': '^GSPC', #<input type="hidden" name="s" value="^GSPC" data-rapid_p="11">
    'a': 1, #stMonth,
    'b': 16, #stDate,
    'c': 2014, #stYear,
    'd': 1, #enMonth,
    'e': 18, #enDate,
    'f': 2014, #enYear,
    'g': 'd', #daily/weekly/monthly
}  

resp = requests.get(url, params=params) 
print resp.text
print resp.url

resp.url is the URL that the request was actually sent to, and you can check it by printing it:

http://finance.yahoo.com/q/hp?a=1&c=2014&b=16&e=18&d=1&g=d&f=2014&s=%5EGSPC
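
If you'd rather check the URL in code than by eye, the standard library can split the query string back into names and values. A small sketch, using the URL printed above as a literal (the variable names are my own):

```python
try:
    from urllib.parse import urlsplit, parse_qs   # Python 3
except ImportError:
    from urlparse import urlsplit, parse_qs       # Python 2

sent_url = ('http://finance.yahoo.com/q/hp'
            '?a=1&c=2014&b=16&e=18&d=1&g=d&f=2014&s=%5EGSPC')

# parse_qs maps each name to a list of values and undoes the percent-encoding
query = parse_qs(urlsplit(sent_url).query)
decoded = dict((name, values[0]) for name, values in query.items())

print(decoded['s'])
```

Note that %5E decodes back to ^, so the server receives the symbol as it was written in params.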

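If you copy that URL into your browser's address bar, you will see the results. resp.text is the HTML markup of the page containing the results. You have to know how to search the HTML to find specific results. To search HTML with Python, check out:

BeautifulSoup
lxml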
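
To give a feel for what "searching the HTML" means, here is a rough sketch that pulls the first cell out of every table row. It uses the standard library's HTMLParser instead of either library above, only so that it runs without installing anything. The class name FirstColumn and the sample markup are hypothetical, and the real page is more deeply nested than this:

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2

class FirstColumn(HTMLParser):
    """Collect the text of the first <td> in every <tr>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.cells = []          # text of the first cell of each row
        self._td_count = 0       # <td> tags seen so far in the current row
        self._in_first = False   # True while inside a row's first <td>

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._td_count = 0
        elif tag == 'td':
            self._td_count += 1
            if self._td_count == 1:
                self._in_first = True
                self.cells.append('')

    def handle_endtag(self, tag):
        if tag == 'td':
            self._in_first = False

    def handle_data(self, data):
        if self._in_first:
            self.cells[-1] += data

# Hypothetical sample markup; the second column is placeholder text
sample = '''
<table>
  <tr><td>Nov 28, 2014</td><td>placeholder</td></tr>
  <tr><td>Nov 26, 2014</td><td>placeholder</td></tr>
</table>
'''

parser = FirstColumn()
parser.feed(sample)
parser.close()
print(parser.cells)
```

For a real page you would feed it resp.text, though BeautifulSoup or lxml will cope far better with nested tables and messy markup.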

Python - Requests: Correct use of params?
