I have installed Cloudera CDH4 and I am trying to run a MapReduce job. I get the following error ->
2012-07-09 15:41:16 ZooKeeperSaslClient [INFO] Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
2012-07-09 15:41:16 ClientCnxn [INFO] Socket connection established to Cloudera/192.168.0.102:2181, initiating session
2012-07-09 15:41:16 RecoverableZooKeeper [WARN] Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
2012-07-09 15:41:16 RetryCounter [INFO] The 1 times to retry after sleeping 2000 ms
2012-07-09 15:41:16 ClientCnxn [INFO] Session establishment complete on server Cloudera/192.168.0.102:2181, sessionid = 0x1386b0b44da000b, negotiated timeout = 60000
2012-07-09 15:41:18 TableOutputFormat [INFO] Created table instance for exact_custodian
2012-07-09 15:41:18 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-07-09 15:41:18 JobSubmitter [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2012-07-09 15:41:18 JobSubmitter [INFO] Cleaning up the staging area file:/tmp/hadoop-hdfs/mapred/staging/hdfs48876562/.staging/job_local_0001
2012-07-09 15:41:18 UserGroupInformation [ERROR] PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
at
I am able to run the example programs that ship in hadoop-mapreduce-examples-2.0.0-cdh4.0.0.jar. But when my own job is submitted to the jobtracker I get this error. It looks like it is trying to access the local file system again: even though I have placed all the libraries the job needs in the distributed cache, it still tries to read from the local directory. Is this problem related to user permissions?
I)
Cloudera:~ # hadoop fs -ls hdfs://<MyClusterIP>:8020/
It shows:
Found 8 items
drwxr-xr-x   - hbase hbase           0 2012-07-04 17:58 hdfs://<MyClusterIP>:8020/hbase
drwxr-xr-x   - hdfs  supergroup      0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/input
drwxr-xr-x   - hdfs  supergroup      0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/output
drwxr-xr-x   - hdfs  supergroup      0 2012-07-06 16:03 hdfs://<MyClusterIP>:8020/tools-lib
drwxr-xr-x   - hdfs  supergroup      0 2012-06-26 14:02 hdfs://<MyClusterIP>:8020/test
drwxrwxrwt   - hdfs  supergroup      0 2012-06-12 16:13 hdfs://<MyClusterIP>:8020/tmp
drwxr-xr-x   - hdfs  supergroup      0 2012-07-06 15:58 hdfs://<MyClusterIP>:8020/user
II)
--- The following commands returned no results ----
hdfs@Cloudera:/etc/hadoop/conf> find . -name '**' | xargs grep "default.name"
hdfs@Cloudera:/etc/hbase/conf> find . -name '**' | xargs grep "default.name"
Instead, I think we are using the new APIs ->
fs.defaultFS -> hdfs://Cloudera:8020, which I have set correctly.
However, for "fs.default.name" I do get entries on my Hadoop 0.20.2 cluster (a non-Cloudera cluster):
cass-hadoop@Pratapgad:~/hadoop/conf> find . -name '**' | xargs grep "default.name"
./core-default.xml: <name>fs.default.name</name>
./core-site.xml: <name>fs.default.name</name>
I think the CDH4 default configuration should add this entry in the corresponding directory (if that is the bug).
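(Added for illustration, not part of the original question.) A small sketch, assuming the CDH4 client jars and the configuration directory are on the classpath, to check which key and value the client configuration actually resolves:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class CheckDefaultFs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // New-style key used by Hadoop 2 / CDH4.
        System.out.println("fs.defaultFS    = " + conf.get("fs.defaultFS"));
        // Deprecated key; Hadoop 2 maps it onto fs.defaultFS, so both should agree.
        System.out.println("fs.default.name = " + conf.get("fs.default.name"));
        // Should print hdfs://Cloudera:8020, not file:///, if the config is picked up.
        System.out.println("resolved FS     = " + FileSystem.get(conf).getUri());
    }
}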
The command I use to run my program:
hdfs@Cloudera:/home/cloudera/yogesh/lib> java -classpath hbase-tools.jar:hbase.jar:slf4j-log4j12-1.6.1.jar:slf4j-api-1.6.1.jar:protobuf-java-2.4.0a.jar:hadoop-common-2.0.0-cdh4.0.0.jar:hadoop-hdfs-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-common-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-core-2.0.0-cdh4.0.0.jar:log4j-1.2.16.jar:commons-logging-1.0.4.jar:commons-lang-2.5.jar:commons-lang3-3.1.jar:commons-cli-1.2.jar:commons-configuration-1.6.jar:guava-11.0.2.jar:google-collect-1.0-rc2.jar:google-collect-1.0-rc1.jar:hadoop-auth-2.0.0-cdh4.0.0.jar:hadoop-auth.jar:jackson.jar:avro-1.5.4.jar:hadoop-yarn-common-2.0.0-cdh4.0.0.jar:hadoop-yarn-api-2.0.0-cdh4.0.0.jar:hadoop-yarn-server-common-2.0.0-cdh4.0.0.jar:commons-httpclient-3.0.1.jar:commons-io-1.4.jar:zookeeper-3.3.2.jar:jdom.jar:joda-time-1.5.2.jar com.hbase.xyz.MyClassName
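For reference, a minimal driver sketch (not part of the original question; the class name and the HDFS jar path are hypothetical) of the pattern the JobSubmitter warning above hints at: implement Tool so that GenericOptionsParser handles -libjars/-conf/-D options, and reference dependency jars that already sit on HDFS (for example under /tools-lib) so the submitter does not look for them on the local file system:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "my-job");
        job.setJarByClass(MyDriver.class);
        // Reference a dependency jar that already lives on HDFS (hypothetical path),
        // rather than a local path such as /home/cloudera/yogesh/lib/hbase.jar.
        job.addFileToClassPath(new Path("/tools-lib/hbase.jar"));
        // ... set mapper, reducer, input/output formats, etc. here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner applies GenericOptionsParser, so -libjars, -conf and -D are honored.
        System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
    }
}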
Answer 0 (score: 4)
I hit the same problem on 2.0.0-cdh4.1.3 while running MR jobs. It was resolved after adding the following property to mapred-site.xml (without it, the framework defaults to the local job runner, which would explain why the submitter was staging to the local file system):
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
For running Hive jobs:
export HIVE_USER=yarn
Answer 1 (score: 2)
Debugging process: try running a simple Hadoop shell command.
hadoop fs -ls /
If it lists HDFS files, your configuration is correct. If not, the configuration is missing; when that happens, Hadoop shell commands such as -ls refer to the local file system instead of the Hadoop file system. This can occur when Hadoop is started through CMS (Cloudera Manager), because it does not store the configuration explicitly in the conf directory.
Check whether the Hadoop file system is shown with the following command (change the port accordingly):
hadoop fs -ls hdfs://host:8020/
If the local file system is shown when you pass the path as /, you should set up the configuration files hdfs-site.xml and mapred-site.xml in the configuration directory. Also, hdfs-site.xml should have an fs.default.name entry pointing to hdfs://host:port/. In my case the directory is /etc/hadoop/conf.
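(A sketch added for illustration, not part of the original answer.) If the client cannot pick up that configuration directory automatically, for example because the cluster is managed by Cloudera Manager, one option is to load the site files explicitly; the paths below assume the /etc/hadoop/conf directory mentioned above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LoadSiteConfig {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Load the cluster's site files from the local path (assumed location).
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/mapred-site.xml"));
        // With the right configuration this prints hdfs://<host>:8020 rather than file:///.
        System.out.println(FileSystem.get(conf).getUri());
    }
}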
See: http://hadoop.apache.org/common/docs/r0.20.2/core-default.html
See if this solves your problem.