I can't download "enron_mail_20150507.tar.gz" by running "python startup.py". I get the following error and don't know how to fix it.
downloading the Enron dataset (this may take a while)
to check on progress, you can cd up one level, then execute <ls -lthr>
Enron dataset should be last item on the list, along with its current
size
download will complete at about 423 MB
Traceback (most recent call last):
File "startup.py", line 36, in <module>
urllib.urlretrieve(url, filename="../enron_mail_20150507.tar.gz")
File "C:\Python27\lib\urllib.py", line 98, in urlretrieve
return opener.retrieve(url, filename, reporthook, data)
File "C:\Python27\lib\urllib.py", line 245, in retrieve
fp = self.open(url, data)
File "C:\Python27\lib\urllib.py", line 213, in open
return getattr(self, name)(url)
File "C:\Python27\lib\urllib.py", line 350, in open_http
h.endheaders(data)
File "C:\Python27\lib\httplib.py", line 1049, in endheaders
self._send_output(message_body)
File "C:\Python27\lib\httplib.py", line 893, in _send_output
self.send(msg)
File "C:\Python27\lib\httplib.py", line 855, in send
self.connect()
File "C:\Python27\lib\httplib.py", line 832, in connect
self.timeout, self.source_address)
File "C:\Python27\lib\socket.py", line 557, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
IOError: [Errno socket error] [Errno 11001] getaddrinfo failed
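I tried changing the url in "startup.py" to "http://www.cs.cmu.edu/~enron/enron_mail_20150507.tar.gz", but that didn't work either. If anyone has downloaded it with Python on Windows, please tell me how. I would really appreciate it.
I also tried downloading it manually, but the download was still running after 1.1 GB, so I got scared and stopped it... haha XD. How big is the "enron_mail_20150507.tar.gz" file? And where should it go once it is downloaded? In ud120-projects?
Please help me. I'm stuck.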
Answer 0 (score: 0)
Problem solved. I downloaded the file manually using the link in startup.py. The file is 1.69 GB compressed and 2.23 GB uncompressed.
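For anyone who hits the same traceback: Errno 11001 is the Windows "host not found" code, so the failure happens while looking up the server name, before any data is transferred. Below is a rough sketch, not taken from startup.py, of two helpers that may be useful. The names host_resolves and extract_archive are made up for illustration. The first checks whether a hostname can be resolved from your machine. The second unpacks a manually downloaded archive. The traceback shows the script saving to "../enron_mail_20150507.tar.gz", so placing the file one level above the directory that contains startup.py is probably what the script expects, but treat that and the extraction destination as assumptions.

```python
import os
import socket
import tarfile


def host_resolves(hostname, port=80):
    """Return True if the local resolver can look up hostname.

    This performs the same getaddrinfo lookup that fails in the
    traceback, so a False result points at DNS, a proxy, or a firewall.
    """
    try:
        socket.getaddrinfo(hostname, port, 0, socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        return False


def extract_archive(archive_path, dest_dir):
    """Unpack a gzip-compressed tar archive into dest_dir."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)


# Example use (paths are assumptions based on the traceback):
#   host_resolves("www.cs.cmu.edu")
#   extract_archive("../enron_mail_20150507.tar.gz", "..")
```

If host_resolves returns False for the download host, the script itself is probably fine and the network setup is the thing to look at. In that case downloading in a browser and unpacking by hand, as described above, is a reasonable workaround.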