I am trying to use this Python module: https://github.com/coursera-dl/edx-dl
Please excuse my basic knowledge.
I installed it with Anaconda 3 on Windows 10:
pip install edx-dl
pip install --upgrade youtube-dl
Then I went to fetch the courses:
edx-dl -u user@user.com --list-courses
edx-dl -u user@user.com COURSE_URL
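This all works, but as soon as the actual download starts I get an SSL/connection error: HTTP Error 403: Forbidden
Fiddler showed that Cloudflare was blocking the request because of the User-Agent. I installed fake-useragent (https://pypi.python.org/pypi/fake-useragent) and added: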
from fake_useragent import UserAgent  # added this

def edx_get_headers():
    """
    Build the Open edX headers to create future requests.
    """
    logging.info('Building initial headers for future requests.')
    headers = {
        'User-Agent': 'edX-downloader/0.01',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Content-Type': 'application/x-www-form-urlencoded;charset=utf-8',
        'Referer': EDX_HOMEPAGE,
        'X-Requested-With': 'XMLHttpRequest',
        'X-CSRFToken': _get_initial_token(EDX_HOMEPAGE),
    }
    ua = UserAgent()  # added this
    headers['User-Agent'] = ua.ie  # added this
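It then downloaded a pdf and an xls, but failed again on the User-Agent because request.py was adding its own header. So I added the fake User-Agent in request.py as well and commented out the default header, as below.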
from fake_useragent import UserAgent
ub = UserAgent()
self.addheaders = [('User-Agent', ub.ie)]
# self.addheaders = [('User-Agent', self.version), ('Accept', '*/*')]
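Editing request.py inside the Anaconda installation changes the standard library for every program that uses it. A less invasive sketch, assuming edx-dl performs these downloads through urllib's default opener (not verified here), is to install a global opener with the desired User-Agent before the download starts. The helper name install_user_agent and the example User-Agent string are hypothetical; build_opener and install_opener are standard urllib.request functions.

```python
import urllib.request


def install_user_agent(user_agent):
    """Make later urllib.request.urlopen() calls send `user_agent` by default."""
    opener = urllib.request.build_opener()
    # Replaces the default 'Python-urllib/x.y' header list on this opener.
    opener.addheaders = [('User-Agent', user_agent)]
    urllib.request.install_opener(opener)
    return opener


# Hypothetical browser-like value; fake_useragent's ua.ie could be passed instead.
opener = install_user_agent('Mozilla/5.0 (Windows NT 10.0; Win64; x64)')
print(opener.addheaders)
```

A request that sets its own User-Agent header still takes precedence over the opener's default, so this only fills in the header where the calling code leaves it unset.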
The new error is below. I can't work out how to troubleshoot this any further. I suspect it cannot find a file or path, possibly because of Windows.
[download] https://youtube.com/watch?v=bKkrDLwDnDE => Downloaded\Implementing_ETL_with_SQL_Server_Integration_Services\02-Module_1__ETL_Processing\01-%(title)s-%(id)s.%(ext)s
Downloading video with URL https://youtube.com/watch?v=bKkrDLwDnDE from YouTube.
Traceback (most recent call last):
File "edx-dl.py", line 6, in <module>
edx_dl.main()
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 1080, in main
download(args, selections, filtered_units, headers)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 857, in download
headers)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 819, in download_unit
headers)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 801, in download_video
skip_or_download(youtube_downloads, headers, args)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 788, in skip_or_download
f(url, filename, headers, args)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 721, in download_url
download_youtube_url(url, filename, headers, args)
File "c:\edx-dl-master\edx-dl-master\edx_dl\edx_dl.py", line 761, in download_youtube_url
execute_command(cmd, args)
File "c:\edx-dl-master\edx-dl-master\edx_dl\utils.py", line 37, in execute_command
subprocess.check_call(cmd)
File "C:\Users\anton\Anaconda3\lib\subprocess.py", line 286, in check_call
retcode = call(*popenargs, **kwargs)
File "C:\Users\anton\Anaconda3\lib\subprocess.py", line 267, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Users\anton\Anaconda3\lib\subprocess.py", line 709, in __init__
restore_signals, start_new_session)
File "C:\Users\anton\Anaconda3\lib\subprocess.py", line 997, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
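This issue reports the same problem, but no solution or help was offered there, so I thought I would try here:
https://github.com/coursera-dl/edx-dl/issues/368
Any advice on how to learn to troubleshoot this would be much appreciated.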
Answer 0 (score: 1)
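I debugged the code and found that it could not find youtube-dl.
I checked echo %PATH% and realized I had this path:
C:...\Anaconda3\
but not C:...\Anaconda3\Scripts\ (this is the location of youtube_dl.exe).
I added this path but did not restart, so the change had not yet taken effect.
After a restart the problem was resolved immediately.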
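The same check can be done from Python before running edx-dl. This is a minimal sketch, assuming the executable is named youtube-dl and that the Scripts directory sits directly under the interpreter's prefix (the usual layout on a Windows Anaconda install, but an assumption here). locate_tool is a hypothetical helper; shutil.which is the standard-library function that searches the directories on PATH.

```python
import os
import shutil
import sys


def locate_tool(name, extra_dirs=()):
    """Return the full path of `name` from PATH or `extra_dirs`, else None."""
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        # `path=` restricts the search to this one directory.
        found = shutil.which(name, path=directory)
        if found:
            return found
    return None


# Assumed location of pip-installed launchers on a Windows Anaconda install.
scripts_dir = os.path.join(sys.prefix, 'Scripts')
tool = locate_tool('youtube-dl', extra_dirs=[scripts_dir])
if tool is None:
    print('youtube-dl was not found at all')
elif shutil.which('youtube-dl') is None:
    print('found at', tool, 'but its directory is not on PATH')
else:
    print('youtube-dl is on PATH at', tool)
```

The middle case is the one described in this answer: the file exists, but subprocess cannot launch it by name until its directory is on PATH and the shell or session has been restarted.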
Answer 1 (score: 0)
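There is another simple solution that does not need Fake_UserAgent: just use a different downloader, such as wget.
Install a fresh copy of edx_dl.
If you are on Windows, download wget and save it somewhere, for example on the H drive.
Change the download_url function like this: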
def download_url(url, filename, headers, args):
    """
    Downloads the given url in filename.
    """
    if is_youtube_url(url):
        download_youtube_url(url, filename, headers, args)
    else:
        # jcline
        # Raw string so the backslash in the Windows path is not read as an escape.
        cmd = [r"h:\wget.exe", url, '-c', '-O', filename,
               '--keep-session-cookies', '--no-check-certificate']
        execute_command(cmd, args)
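Hard-coding h:\wget.exe ties the change to one machine. As a sketch only, the command could be built by a small helper that prefers an explicit path and otherwise looks wget up on PATH. build_wget_command and its wget_path parameter are hypothetical names; the wget options are the ones used in this answer.

```python
import shutil


def build_wget_command(url, filename, wget_path=None):
    """Return the argument list for a resumable wget download of `url`."""
    # Prefer an explicit path; otherwise search PATH for wget.
    executable = wget_path or shutil.which('wget')
    if executable is None:
        raise FileNotFoundError('wget not found; pass wget_path explicitly')
    return [executable, url, '-c', '-O', filename,
            '--keep-session-cookies', '--no-check-certificate']


# Raw string so the backslash is not treated as an escape character.
cmd = build_wget_command('https://example.com/file.pdf', 'file.pdf',
                         wget_path=r'h:\wget.exe')
print(cmd)
```

Note that --no-check-certificate turns off TLS certificate verification, so it is worth dropping if the download works without it.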