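I am trying to set up Open edX Insights on an Ubuntu instance.
I started from the following steps: https://openedx.atlassian.net/wiki/spaces/OpenOPS/pages/43385371/edX+Analytics+Installation
The installation went fine up to step 5. After that, when I try to run CourseEnrollmentEventsTask, I get the following error: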
remote-task --host localhost --user ubuntu --remote-name analyticstack --skip-setup --wait CourseEnrollmentEventsTask --local-scheduler --interval 2018 --verbose --override-config /home/ubuntu/edx-analytics-pipeline/config/devstack.cfg --n-reduce-tasks 8

edx-analytics-pipeline/marker/-4077723021222861505-temp-2018-10-23T16-46-24.874897
2018-10-23 16:46:28,981 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:28 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
2018-10-23 16:46:31,823 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:31 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2018-10-23 16:46:32,152 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:32 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2018-10-23 16:46:34,331 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:34 INFO mapred.FileInputFormat: Total input paths to process : 211
2018-10-23 16:46:35,220 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO mapreduce.JobSubmitter: number of splits:211
2018-10-23 16:46:35,241 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
2018-10-23 16:46:35,242 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2018-10-23 16:46:35,401 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1540309647275_0006
2018-10-23 16:46:35,720 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO impl.YarnClientImpl: Submitted application application_1540309647275_0006
2018-10-23 16:46:35,786 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1540309647275_0006/
2018-10-23 16:46:35,793 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 16:46:35 INFO mapreduce.Job: Running job: job_1540309647275_0006
2018-10-23 17:09:24,334 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 INFO mapreduce.Job: Job job_1540309647275_0006 running in uber mode : false
2018-10-23 17:09:24,337 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 INFO mapreduce.Job: map 0% reduce 0%
2018-10-23 17:09:24,353 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 INFO mapreduce.Job: Job job_1540309647275_0006 failed with state KILLED due to: Application killed by user.
2018-10-23 17:09:24,385 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 INFO mapreduce.Job: Counters: 0
2018-10-23 17:09:24,385 INFO 25647 [luigi-interface] hadoop.py:339 - 18/10/23 17:09:24 ERROR streaming.StreamJob: Job not successful!
2018-10-23 17:09:24,386 INFO 25647 [luigi-interface] hadoop.py:339 - Streaming Command Failed!
2018-10-23 17:09:24,732 ERROR 25647 [luigi-interface] worker.py:213 - [pid 25647] Worker Worker(salt=800759884, workers=1, host=localhost, username=hadoop, pid=25647, sudo_user=root) failed CourseEnrollmentEventsTask(source=["hdfs://localhost:9000/data/"], interval=2018, expand_interval=0 w 2 d 0 h 0 m 0 s, pattern=[".tracking.log."], date_pattern=%Y%m%d, warehouse_path=hdfs://localhost:9000/edx-analytics-pipeline/warehouse/)
Traceback (most recent call last):
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/worker.py", line 194, in run
    new_deps = self._run_get_new_deps()
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/worker.py", line 131, in _run_get_new_deps
    task_gen = self.task.run()
  File "/var/lib/analytics-tasks/analyticstack/venv/local/lib/python2.7/site-packages/edx/analytics/tasks/insights/enrollments.py", line 152, in run
    super(CourseEnrollmentEventsTask, self).run()
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/contrib/hadoop.py", line 781, in run
    self.job_runner().run_job(self)
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/contrib/hadoop.py", line 622, in run_job
    run_and_track_hadoop_job(arglist, tracking_url_callback=job.set_tracking_url)
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/contrib/hadoop.py", line 390, in run_and_track_hadoop_job
    return track_process(arglist, tracking_url_callback, env)
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/contrib/hadoop.py", line 380, in track_process
    (tracking_url, e), out, err)
HadoopJobError: Streaming job failed with exit code 1. Additionally, an error occurred when fetching data from http://localhost:8088/proxy/application_1540309647275_0006/: No module named mechanize
2018-10-23 17:09:24,751 INFO 25647 [luigi-interface] worker.py:501 - Informing scheduler that task CourseEnrollmentEventsTask__Y_m_d_0_w_2_d_0_h_0_m__2018_4fba0fee90 has status FAILED
2018-10-23 17:09:24,789 INFO 25647 [luigi-interface] worker.py:401 - Worker Worker(salt=800759884, workers=1, host=localhost, username=hadoop, pid=25647, sudo_user=root) was stopped. Shutting down Keep-Alive thread
2018-10-23 17:09:24,794 INFO 25647 [luigi-interface] interface.py:208 -
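Luigi Execution Summary
Scheduled 2 tasks of which:
This progress looks :( because there were failed tasks
Luigi Execution Summary
Connection to localhost closed. Exiting with status = 30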
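
Since most of the output above is routine INFO noise, here is a minimal sketch of how the failure-related lines could be filtered out of such a log. It is not part of edx-analytics-pipeline or Luigi; the names `FAILURE_MARKERS`, `failure_lines`, `missing_modules` and `SAMPLE_LOG` are hypothetical, and the markers are simply strings that appear in the log above.

```python
import re

# Hypothetical marker strings, chosen from the text of the log above.
FAILURE_MARKERS = ("ERROR", "KILLED", "HadoopJobError", "No module named")

# A few lines copied from the log above, used only as sample input.
SAMPLE_LOG = """\
18/10/23 17:09:24 INFO mapreduce.Job: map 0% reduce 0%
18/10/23 17:09:24 INFO mapreduce.Job: Job job_1540309647275_0006 failed with state KILLED due to: Application killed by user.
18/10/23 17:09:24 ERROR streaming.StreamJob: Job not successful!
HadoopJobError: Streaming job failed with exit code 1. Additionally, an error occurred when fetching data from http://localhost:8088/proxy/application_1540309647275_0006/: No module named mechanize
"""


def failure_lines(log_text):
    """Return, in order, the lines that contain any failure marker."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in FAILURE_MARKERS)
    ]


def missing_modules(log_text):
    """Return module names reported as 'No module named X', without duplicates."""
    found = []
    for name in re.findall(r"No module named ([\w.]+)", log_text):
        if name not in found:
            found.append(name)
    return found


if __name__ == "__main__":
    for line in failure_lines(SAMPLE_LOG):
        print(line)
    print("Missing modules:", missing_modules(SAMPLE_LOG))
```

On the sample lines this leaves two separate things to look at: the YARN application ended in state KILLED, and the attempt to fetch the tracking URL afterwards failed on a missing `mechanize` module.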