I'm trying to request data from a URL, but I keep getting a 404 error.
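import base64
import bs4
import requests

session = requests.Session()
# Fetch the form page first so the session picks up any cookies
GET = session.get("http://www.inmet.gov.br/sonabra/pg_dspDadosCodigo_sim.php?QTgwNA==", timeout=1)
GET.raise_for_status()
soup = bs4.BeautifulSoup(GET.text, 'html.parser')
# Take the value that follows "imgNum=" in the first <img> src, then Base64-decode it
imgNumber = soup.img.get("src").split("imgNum=", maxsplit=1)[1]
decodeNumber = str(base64.b64decode(imgNumber), 'utf-8')
request = {"dtaini": self.start, "aleaValue": imgNumber,
           "aleaNum": decodeNumber, "dtafim": self.end}
# Submit the form data; this is the request that ends in the 404
POST = session.post(stationURL, data=request)
POST.raise_for_status()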
Log output:
urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): www.inmet.gov.br
urllib3.connectionpool: DEBUG: http://www.inmet.gov.br:80 "GET /sonabra/pg_dspDadosCodigo_sim.php?QTgwNA== HTTP/1.1" 200 690
urllib3.connectionpool: DEBUG: /sonabra/pg_dspDadosCodigo_sim.php?QTgwNA== HTTP/1.1" 302 498
urllib3.connectionpool: DEBUG: http://www.inmet.gov.br:80 "GET /sonabra/log2/index.php HTTP/1.1" 404 302
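I'm not sure what is going on, because this worked a few months ago and the request works in a browser. I'd appreciate any help or suggestions.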
Answer 0 (score: 0)
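This has nothing to do with Python; I get a 404 even when using a web browser. Try a different URL, or contact the server owner if you think this shouldn't happen.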
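To see where the 404 actually comes from, it can help to list every hop of the redirect chain. requests keeps the intermediate responses in `response.history`, so a small helper can print them. The sketch below is only an illustration: `redirect_chain` is a hypothetical helper name, and the two stand-in objects mimic the 302 and 404 hops from the question's log so the example runs without network access.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return [(status_code, url), ...] for each hop, ending with the final response."""
    hops = list(response.history) + [response]
    return [(r.status_code, r.url) for r in hops]


# Stand-in objects (not real requests.Response instances) that mirror
# the 302 -> 404 sequence shown in the question's log.
first_hop = SimpleNamespace(
    status_code=302,
    url="http://www.inmet.gov.br/sonabra/pg_dspDadosCodigo_sim.php?QTgwNA==",
    history=[],
)
final = SimpleNamespace(
    status_code=404,
    url="http://www.inmet.gov.br/sonabra/log2/index.php",
    history=[first_hop],
)

chain = redirect_chain(final)
for status, url in chain:
    print(status, url)
```

With a real session, the same helper could be called on the object returned by `session.post(...)` before `raise_for_status()`. A 302 followed by a 404 suggests the server rejected the submitted form and redirected to a page that does not exist.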
Answer 1 (score: 0)
For some reason I got confused while debugging: Chrome and Firefox were showing me the old form. The form data interface had actually changed from:
request = {"dtaini": self.start, "aleaValue": imgNumber,
"aleaNum": decodeNumber, "dtafim": self.end}
to:
request = {"aleaValue": aleaValue, "xaleaValue": xaleaValue,
"aleaNum": aleaNum, "xID": xID,
"dtaini": self.start, "dtafim": self.end}
My advice to anyone with a similar problem is to take a break; things look better the next morning over a coffee...
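Since the breakage came from the form gaining new fields, one way to make the scraper less fragile is to read the field names from the page itself instead of hard-coding them. The sketch below is a rough illustration using only the standard library's `html.parser` (the same could be done with bs4). The sample markup, the field values, and the date format are assumptions made up for the example, not the real page's contents, and `collect_form_fields` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs from every <input> that has a name attribute."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        if name:
            # Inputs without a value attribute are stored as empty strings
            self.fields[name] = attr_map.get("value") or ""


def collect_form_fields(html):
    parser = FormFieldCollector()
    parser.feed(html)
    return parser.fields


# Hypothetical markup; the real form's fields and values may differ.
SAMPLE_HTML = """
<form method="post">
  <input type="hidden" name="aleaValue" value="abc">
  <input type="hidden" name="xaleaValue" value="def">
  <input type="hidden" name="aleaNum" value="123">
  <input type="hidden" name="xID" value="9">
  <input type="text" name="dtaini">
  <input type="text" name="dtafim">
</form>
"""

payload = collect_form_fields(SAMPLE_HTML)
# Fill in the user-supplied dates (format assumed for illustration)
payload.update({"dtaini": "01/01/2018", "dtafim": "31/01/2018"})
print(payload)
```

In the real script, `GET.text` would take the place of `SAMPLE_HTML`, and the resulting dictionary would be passed as `data=` to `session.post`. Whether every field on the page should be echoed back unchanged is something to verify against what the browser actually sends.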