Open edX Insights configuration error: No module named mechanize

Time: 2018-10-23 18:24:29

Tags: java python django hadoop openedx

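I am trying to configure Open edX Insights on an Ubuntu instance.

I started by following these steps: https://openedx.atlassian.net/wiki/spaces/OpenOPS/pages/43385371/edX+Analytics+Installation

The installation went fine up to step 5. After that, when I try to run CourseEnrollmentEventsTask, I get the following error: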

  

remote-task --host localhost --user ubuntu --remote-name analyticstack --skip-setup --wait CourseEnrollmentEventsTask --local-scheduler --interval 2018 --verbose --override-config /home/ubuntu/edx-analytics-pipeline/config/devstack.cfg --n-reduce-tasks 8

edx-analytics-pipeline/marker/-4077723021222861505-temp-2018-10-23T16-46-24.874897
2018-10-23 16:46:28,981 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:28 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
2018-10-23 16:46:31,823 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:31 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2018-10-23 16:46:32,152 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:32 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2018-10-23 16:46:34,331 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:34 INFO mapred.FileInputFormat: Total input paths to process : 211
2018-10-23 16:46:35,220 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO mapreduce.JobSubmitter: number of splits:211
2018-10-23 16:46:35,241 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
2018-10-23 16:46:35,242 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2018-10-23 16:46:35,401 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1540309647275_0006
2018-10-23 16:46:35,720 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO impl.YarnClientImpl: Submitted application application_1540309647275_0006
2018-10-23 16:46:35,786 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1540309647275_0006/
2018-10-23 16:46:35,793 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO mapreduce.Job: Running job: job_1540309647275_0006
2018-10-23 17:09:24,334 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 INFO mapreduce.Job: Job job_1540309647275_0006 running in uber mode : false
2018-10-23 17:09:24,337 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 INFO mapreduce.Job: map 0% reduce 0%
2018-10-23 17:09:24,353 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 INFO mapreduce.Job: Job job_1540309647275_0006 failed with state KILLED due to: Application killed by user.
2018-10-23 17:09:24,385 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 INFO mapreduce.Job: Counters: 0
2018-10-23 17:09:24,385 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 ERROR streaming.StreamJob: Job not successful!
2018-10-23 17:09:24,386 INFO 25647 [luigi-interface] hadoop.py:339 - Streaming Command Failed!
2018-10-23 17:09:24,732 ERROR 25647 [luigi-interface] worker.py:213 - [pid 25647] Worker Worker(salt=800759884, workers=1, host=localhost, username=hadoop, pid=25647, sudo_user=root) failed CourseEnrollmentEventsTask(source=["hdfs://localhost:9000/data/"], interval=2018, expand_interval=0 w 2 d 0 h 0 m 0 s, pattern=[".tracking.log."], date_pattern=%Y%m%d, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/)
Traceback (most recent call last):
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/worker.py", line 194, in run
    new_deps = self._run_get_new_deps()
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/worker.py", line 131, in _run_get_new_deps
    task_gen = self.task.run()
  File "/var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/edx/analytics/tasks/insights/enrollments.py", line 152, in run
    super(CourseEnrollmentEventsTask, self).run()
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/contrib/hadoop.py", line 781, in run
    self.job_runner().run_job(self)
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/contrib/hadoop.py", line 622, in run_job
    run_and_track_hadoop_job(arglist, tracking_url_callback=job.set_tracking_url)
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/contrib/hadoop.py", line 390, in run_and_track_hadoop_job
    return track_process(arglist, tracking_url_callback, env)
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/contrib/hadoop.py", line 380, in track_process
    (tracking_url, e), out, err)
HadoopJobError: Streaming job failed with exit code 1. Additionally, an error occurred when fetching data from http://localhost:8088/proxy/application_1540309647275_0006/: No module named mechanize
2018-10-23 17:09:24,751 INFO 25647 [luigi-interface] worker.py:501 - Informed scheduler that task CourseEnrollmentEventsTask__Y_m_d_0_w_2_d_0_h_0_m__2018_4fba0fee90 has status FAILED
2018-10-23 17:09:24,789 INFO 25647 [luigi-interface] worker.py:401 - Worker Worker(salt=800759884, workers=1, host=localhost, username=hadoop, pid=25647, sudo_user=root) was stopped. Shutting down Keep-Alive thread
2018-10-23 17:09:24,794 INFO 25647 [luigi-interface] interface.py:208 -

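Luigi Execution Summary

Scheduled 2 tasks of which:

  • 1 present dependencies were encountered:
    • 1 PathSelectionByDateIntervalTask(source=["hdfs://localhost:9000/data/"], interval=2018, expand_interval=0 w 2 d 0 h 0 m 0 s, pattern=[".tracking.log."], date_pattern=%Y%m%d)
  • 1 failed:
    • 1 CourseEnrollmentEventsTask(...)

This progress looks :( because there were failed tasks

Luigi Execution Summary

Connection to localhost closed. Exiting with status = 30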
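
Reading the log, the "No module named mechanize" message appears only after the streaming job itself had already failed (state KILLED). It seems to be raised while Luigi tries to fetch failure details from the tracking URL, so it is probably a secondary error, not the root cause. To confirm that the module really is missing, a minimal check like the sketch below could be run with the Python interpreter of the virtualenv shown in the traceback (/var/lib/analytics-tasks/analyticstack/venv). The helper name `module_available` is hypothetical, and the assumption that this virtualenv is the one the task actually uses comes only from the traceback paths.

```python
import importlib


def module_available(name):
    """Return True if `name` can be imported by the running interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # "mechanize" is the module named in the error; "luigi" is checked as a
    # sanity test that the intended virtualenv interpreter is being used.
    for name in ("luigi", "mechanize"):
        status = "importable" if module_available(name) else "MISSING"
        print("%s: %s" % (name, status))
```

If `mechanize` is reported as missing, installing it into that virtualenv should only make Luigi's error reporting more informative; the reason the job ended in state KILLED would still need to be investigated separately, for example in the YARN application logs.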

0 answers:

No answers