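I have been trying to scrape web content from a router's page; most of that content is generated by JavaScript. For this I am using Python's dryscrape package. Unfortunately, I do not see any authentication option in the session class for supplying the page's username and password so that I can extract content from the main page. As a result, my current Python code (below) keeps returning an error message (shown after it):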
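import dryscrape
from bs4 import BeautifulSoup

# Open a headless WebKit session and load the router's main page
session = dryscrape.Session()
session.visit('http://140.158.48.122:80')

# Grab the rendered HTML and parse it (explicit parser avoids a bs4 warning)
response = session.body()
soup = BeautifulSoup(response, 'html.parser')
print(soup)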
Output:
<html><head><title>401 Authorization Required</title></head>
<body><h1>HTTP Error 401: Authorization Required</h1>
This server can not verify that you are authorized for access. Either
one or both of your user ID and password are invalid for whatever
reason (entered incorrectly, user ID suspended etc.).<br/><br/>Admins
can retrieve lost credentials from http://x.x.x.x/password.cgi (where
x.x.x.x is the server's LAN IP). Requires master login and master
password.
/index.htm from this server.
</body></html>
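One possible way around the 401, sketched under some assumptions: if the router uses HTTP Basic authentication (the response above does not say which scheme it expects), the credentials can be sent as an `Authorization` header. The sketch assumes that `dryscrape.Session` exposes `set_header` through its underlying webkit_server driver; the helper names `basic_auth_header` and `fetch_with_basic_auth` are hypothetical and not part of dryscrape.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return "Basic " + token


def fetch_with_basic_auth(url, username, password):
    """Load `url` in dryscrape with a Basic auth header and return the HTML.

    Assumption: the session object provides set_header(); dryscrape is
    imported lazily so the helper above works without it installed.
    """
    import dryscrape

    session = dryscrape.Session()
    session.set_header("Authorization", basic_auth_header(username, password))
    session.visit(url)
    return session.body()
```

A call would look like `fetch_with_basic_auth('http://140.158.48.122:80', 'USERNAME', 'PASSWORD')`, with the placeholders replaced by the router's real login, and the returned HTML can then be passed to BeautifulSoup as before. If the router expects Digest authentication or a login form instead, this header will not be enough.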