My Python can't open URLs, and nobody can figure out why

Asked: 2019-10-14 19:48:38

Tags: python python-3.x http url

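All I'm trying to do is scrape some earthquake data from a website. Really, I just want Python to be able to pull data from a URL. For some reason, even the simplest code that just opens a URL and calls '.readlines()' runs into a whole lot of errors. It doesn't seem to understand the 'urlopen' command, or anything else for that matter.

I don't even know what to try, because I can't parse the errors it's giving me. I'm hoping someone has an answer before I have to do something drastic like re-downloading Python.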

import urllib.request

def urltest():

    url = "http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.csv"
    f = urllib.request.urlopen(url)
    allLines = f.readlines()
    f.close()
    line = allLines[0].decode()
    print(line)

urltest()

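This is the code I'm using as a simple test. The URL goes to a site hosting a .csv file that Python should be able to fetch and read easily.

I can post the entire wall of errors this code returns if anyone wants it. There seem to be at least six different kinds of them, but this is the last line it spat out: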

urllib.error.URLError: <urlopen error unknown url type: https>

3 answers:

Answer 0 (score: 1)

Looking through the urllib.request module, it loads a collection of handlers. We can see this snippet in urllib/request.py:

if hasattr(http.client, "HTTPSConnection"):
    default_classes.append(HTTPSHandler)
skip = set()
for klass in default_classes:
    for check in handlers:
        if isinstance(check, type):
            if issubclass(check, klass):
                skip.add(klass)
        elif isinstance(check, klass):
            skip.add(klass)
for klass in skip:
    default_classes.remove(klass)

for klass in default_classes:
    opener.add_handler(klass())

So the https handler class is only loaded if http.client has the attribute HTTPSConnection. If we look at http/client.py, we can see the following code that sets this attribute.

try:
    import ssl
except ImportError:
    pass
else:
    class HTTPSConnection(HTTPConnection):
        "This class allows communication via SSL."

        default_port = HTTPS_PORT

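So the HTTPSConnection class is only created if the ssl module can be imported successfully. If your system does not have the ssl module, http.client will not define the HTTPSConnection class, so the attribute will not exist, and urllib will not load a handler for https.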
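The chain described above can be checked directly on the affected interpreter. This is only a sketch: the helper name https_support_report and its dictionary keys are made up for illustration, while the attributes it inspects are the ones that appear in the standard-library snippets quoted above.

```python
import http.client
import urllib.request


def https_support_report():
    """Report which pieces of the https chain exist in this interpreter."""
    report = {}
    # http.client only defines HTTPSConnection when "import ssl" succeeded.
    report["HTTPSConnection"] = hasattr(http.client, "HTTPSConnection")
    # urllib.request only defines HTTPSHandler under the same condition.
    handler_cls = getattr(urllib.request, "HTTPSHandler", None)
    report["HTTPSHandler"] = handler_cls is not None
    # build_opener() installs the default handlers; look for the https one.
    opener = urllib.request.build_opener()
    report["handler_installed"] = handler_cls is not None and any(
        isinstance(h, handler_cls) for h in opener.handlers
    )
    return report


if __name__ == "__main__":
    for name, present in https_support_report().items():
        print(f"{name}: {'available' if present else 'MISSING'}")
```

If any entry is reported as missing, urlopen has no handler for https URLs on that installation, which matches the "unknown url type: https" error.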

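The code you provided works on my system. However, I can reproduce your error by adding the following lines before it, which make my system unable to find the ssl module: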

#load then remove the ssl module from the system
import sys
import ssl
del ssl
sys.modules['ssl']=None

import urllib.request


def urltest():

    url = "http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.csv"
    f = urllib.request.urlopen(url)
    allLines = f.readlines()
    f.close()
    line = allLines[0].decode()
    print(line)

urltest()

With the ssl module hidden this way, the script fails with the same error you got:

C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\python.exe C:/Users/cd00119621/PycharmProjects/ideas/stackoverflow.py
Traceback (most recent call last):
  File "C:/Users/cd00119621/PycharmProjects/ideas/stackoverflow.py", line 19, in <module>
    urltest()
  File "C:/Users/cd00119621/PycharmProjects/ideas/stackoverflow.py", line 13, in urltest
    f = urllib.request.urlopen(url)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 563, in error
    result = self._call_chain(*args)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 755, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 548, in _open
    'unknown_open', req)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 1387, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: https>

Note the http_error_302 frame: the http:// URL is redirected to an https:// one, which is why an https error shows up for an http URL.

So I suspect you have installed a Python that was not configured with ssl. You can easily verify this by trying to import ssl from the Python command line. If the following line raises an ImportError, the ssl module is missing:

import ssl

That would be the cause of your problem. You will have to reinstall a Python that is configured with ssl, or somehow build the ssl module from source.
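The import check can also be run as a small script that reports the result instead of raising. This is a sketch under a couple of assumptions: check_ssl is a hypothetical helper name, and ssl.OPENSSL_VERSION is used only to show which OpenSSL build the interpreter was linked against when the import does succeed.

```python
import importlib


def check_ssl():
    """Try to import ssl; return (ok, detail) instead of raising."""
    try:
        ssl = importlib.import_module("ssl")
    except ImportError as exc:
        # Covers both a missing module and a broken _ssl extension.
        return False, f"ssl could not be imported: {exc}"
    return True, ssl.OPENSSL_VERSION


if __name__ == "__main__":
    ok, detail = check_ssl()
    print("ssl available:" if ok else "ssl missing:", detail)
```

Run it with the same python.exe that produces the error, since a machine can have several Python installations and only one of them may lack ssl.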

Answer 1 (score: 0)

The problem appears to be a network issue (DNS / proxy / firewall). https://github.com/pbugnion/gmaps/issues/245

Answer 2 (score: -1)

You can use pandas:

import pandas as pd
data = pd.read_csv('http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.csv')
print(data)