Sqoop job importing data from SQL Server is stuck at map 0%

Date: 2014-07-01 22:46:17

Tags: hadoop sqoop yarn

I have a pseudo-distributed Hadoop cluster running CDH 5.0.2. I am running this Sqoop import command:

sudo -u sqoop sqoop import --connect "jdbc:sqlserver://x.x.x.x:1433;databaseName=yyyyy" --username x --password y --table table_name

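I am importing a very small table, with 12 rows and 2 columns, just for testing. The job has been running for half an hour. In my ResourceManager, the mapper tasks' state is listed as NEW and their status as SCHEDULED. I don't think it is ever going to run!

When I list the applications on YARN using: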

yarn application -list

I get:

14/07/01 15:55:06 INFO client.RMProxy: Connecting to ResourceManager at host/x.x.x.x:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1
Application-Id      Application-Name        Application-Type          User           Queue                   State             Final-State             Progress                        Tracking-URL
application_1404252440376_0001      ActivityType.jar               MAPREDUCE         sqoop                   root.sqoop        RUNNING               UNDEFINED                   5%                  http://host:42583

Here is the application master log I am looking at. How can I resolve this?

2014-07-01 15:14:12,880 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2014-07-01 15:14:12,885 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2014-07-01 15:14:12,888 INFO [main] org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at host/x.x.x.x:8030
2014-07-01 15:14:12,973 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: maxContainerCapability: 1024
2014-07-01 15:14:12,973 INFO [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: queue: root.sqoop
2014-07-01 15:14:12,977 INFO [main] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Upper limit on the thread pool size is 500
2014-07-01 15:14:12,979 INFO [main] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies : 500
2014-07-01 15:14:12,985 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1404252440376_0001Job Transitioned from INITED to SETUP
2014-07-01 15:14:12,987 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2014-07-01 15:14:12,997 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1404252440376_0001Job Transitioned from SETUP to RUNNING
2014-07-01 15:14:13,018 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1404252440376_0001_m_000000 Task Transitioned from NEW to SCHEDULED
2014-07-01 15:14:13,019 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1404252440376_0001_m_000001 Task Transitioned from NEW to SCHEDULED
2014-07-01 15:14:13,019 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1404252440376_0001_m_000002 Task Transitioned from NEW to SCHEDULED
2014-07-01 15:14:13,019 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1404252440376_0001_m_000003 Task Transitioned from NEW to SCHEDULED
2014-07-01 15:14:13,021 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1404252440376_0001_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-07-01 15:14:13,021 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1404252440376_0001_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-07-01 15:14:13,021 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1404252440376_0001_m_000002_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-07-01 15:14:13,021 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1404252440376_0001_m_000003_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2014-07-01 15:14:13,022 INFO [Thread-51] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceReqt:1024
2014-07-01 15:14:13,066 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1404252440376_0001, File: hdfs://host:8020/user/sqoop/.staging/job_1404252440376_0001/job_1404252440376_0001_1.jhist
2014-07-01 15:14:13,976 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:4 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2014-07-01 15:14:14,054 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1404252440376_0001: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:0, vCores:0> knownNMs=1
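
Two values in this log are worth comparing: `mapResourceReqt:1024` (memory requested per map container, in MB) and `resourcelimit=<memory:0, vCores:0>` on the last line, which, as far as I understand it, is the headroom the ResourceManager reports back to the application master. A minimal Python sketch that pulls both out of the log text follows. The function name `diagnose` and the regular expressions are illustrative and based only on the line formats shown above, and the headroom a scheduler reports may not always be exact.

```python
import re

# Headroom as printed by RMContainerRequestor, e.g.
# "resourcelimit=<memory:0, vCores:0> knownNMs=1"
HEADROOM_RE = re.compile(r"resourcelimit=<memory:(\d+), vCores:(\d+)>")
# Per-map memory request as printed by RMContainerAllocator.
MAP_REQ_RE = re.compile(r"mapResourceReqt:\s*(\d+)")


def diagnose(log_text):
    """Return (map_request_mb, headroom_mb, starved) from AM log text.

    starved is True when the last reported headroom is smaller than the
    memory requested per map container. Values that are not found in the
    text come back as None, and starved is then False.
    """
    req = MAP_REQ_RE.search(log_text)
    headrooms = HEADROOM_RE.findall(log_text)
    map_request_mb = int(req.group(1)) if req else None
    # Use the most recent headroom line, since the value changes over time.
    headroom_mb = int(headrooms[-1][0]) if headrooms else None
    starved = (
        map_request_mb is not None
        and headroom_mb is not None
        and headroom_mb < map_request_mb
    )
    return map_request_mb, headroom_mb, starved


if __name__ == "__main__":
    sample = (
        "... RMContainerAllocator: mapResourceReqt:1024\n"
        "... RMContainerRequestor: getResources() for "
        "application_1404252440376_0001: ask=1 release= 0 newContainers=0 "
        "finishedContainers=0 resourcelimit=<memory:0, vCores:0> knownNMs=1\n"
    )
    print(diagnose(sample))  # prints (1024, 0, True)
```

For the log above this reports a 1024 MB request against 0 MB of headroom, which would be consistent with map tasks that stay in SCHEDULED and never get a container.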

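2 answers:

Answer 0: (score: 0)

The main problem I ran into was that there were not enough resources to run Sqoop.

When I ran Sqoop alongside other YARN applications, it usually could not get enough resources, so the map tasks always stayed at 0%. I went to the driver logs, and the last few lines were: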

2017-02-01 14:54:48,638 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,638 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000001_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,638 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000002_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,638 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000003_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,639 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000004_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2017-02-01 14:54:48,639 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1484947659248_0500_m_000005_0 TaskAttempt Transitioned from NEW to UNASSIGNED

After this, nothing more was logged, and Sqoop stayed at 0%.

When nothing else is running on YARN, Sqoop runs without any problems.
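
One way to check for this before launching the import is to ask the ResourceManager how much memory is free. Below is a minimal Python sketch, assuming the ResourceManager REST API is reachable and that its `/ws/v1/cluster/metrics` endpoint returns a `clusterMetrics` object with an `availableMB` field. The helper names `fetch_cluster_metrics` and `can_fit` and the address used in the comments are hypothetical; 8088 is only the default web port and may differ on your cluster.

```python
import json
from urllib.request import urlopen


def fetch_cluster_metrics(rm_url):
    """Fetch the clusterMetrics object from the ResourceManager REST API.

    rm_url is the base address of the RM web UI, for example
    "http://host:8088" (hypothetical; adjust to your cluster).
    """
    url = rm_url.rstrip("/") + "/ws/v1/cluster/metrics"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)["clusterMetrics"]


def can_fit(metrics, container_mb, containers=1):
    """True if the free memory reported by the RM covers the request."""
    return metrics["availableMB"] >= container_mb * containers


# Example use (not executed here because it needs a live cluster):
#   metrics = fetch_cluster_metrics("http://host:8088")
#   if not can_fit(metrics, container_mb=1024, containers=4):
#       print("Not enough free memory for 4 map containers of 1024 MB")
```

This only looks at cluster-wide free memory. Queue limits and vCores can also keep containers from being allocated, so a passing check does not guarantee that the job will start.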

Answer 1: (score: -1)

This is not the most helpful answer, but I ended up reinstalling the whole thing.