On a CDH4 stack, I am trying to have a MapReduce job write its output to an HBase table. For some reason it fails during the addDependencyJars call while the job is being configured.
As far as I can tell, the HBase configuration is not picking up the Hadoop configuration (see the warnings in the job output). Below are my hdfs-site.xml, the job configuration, the job output with the stack trace, and the file permissions.
Any help or insight into how to debug this further is much appreciated.
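One way to rule out the suspicion that the HBase configuration is not picking up the Hadoop configuration is to build a plain Hadoop Configuration first and merge it into the HBase one explicitly. This is only a minimal sketch of that idea; HBaseConfiguration.create(Configuration) and Configuration.addResource(Path) exist in the CDH4-era APIs as far as I know, and the /etc/hadoop/conf/hdfs-site.xml path below is an assumption for illustration.
// Sketch: load the Hadoop side first (core-site.xml / hdfs-site.xml from the
// classpath), then layer the HBase resources on top of it.
Configuration hadoopConf = new Configuration();
// If hdfs-site.xml is not on the classpath, add it explicitly (path assumed):
hadoopConf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
Configuration conf = HBaseConfiguration.create(hadoopConf);
// Quick sanity check that the Hadoop settings were actually picked up:
System.out.println("fs.defaultFS = " + conf.get("fs.defaultFS"));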
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- replication configuration -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/var/hadoop/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/var/hadoop/datanode</value>
  </property>
</configuration>
Job configuration
Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf);
job.setJarByClass(LocalCsvCdrHbaseJob.class);
job.setJobName("Local CVS CDR Venue Session Analysis to hbase");
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(VenueSession.class);
job.setMapperClass(VenueMapper.class);
job.setReducerClass(VenueSessionCountHbaseReducer.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TableOutputFormat.class);
FileInputFormat.setInputPaths(job, new Path(args[0]));
// Wire the reducer to the "venue_session" HBase table; this call also runs
// addDependencyJars internally, which is where the exception below is thrown.
TableMapReduceUtil.initTableReducerJob("venue_session", VenueSessionCountHbaseReducer.class, job);
TableMapReduceUtil.addDependencyJars(job);
job.waitForCompletion(true);
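Since the explicit addDependencyJars call and the one made internally by initTableReducerJob both go through the code path that fails below, one workaround to try, purely as a sketch and not a confirmed fix, is the longer initTableReducerJob overload that takes an addDependencyJars flag, shipping the HBase/ZooKeeper jars another way (for example via -libjars or the task classpath). The overload below is from the HBase 0.94-era API as I remember it; verify it against the version bundled with CDH4.
// Sketch: skip the automatic dependency-jar packaging that throws
// "Permission denied"; the jars then have to reach the tasks via -libjars
// or the cluster classpath. Signature assumed from HBase 0.94-era code.
TableMapReduceUtil.initTableReducerJob(
    "venue_session",                     // output table
    VenueSessionCountHbaseReducer.class, // reducer
    job,
    null,                                // partitioner (use default)
    null, null, null,                    // quorum address, server class, server impl
    false);                              // addDependencyJars = false
// The separate TableMapReduceUtil.addDependencyJars(job) call would also
// have to be removed for this to make any difference.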
The `hbase classpath` output definitely includes the Hadoop conf directory (/etc/hadoop/conf).
:~ # sudo -u mapred HADOOP_CLASSPATH=`hbase classpath` hadoop jar /home/mapred/cdr-hadoop-0.0.0-SNAPSHOT.jar net.thecloud.bi.cdr.jobs.LocalCsvCdrHbaseJob /cdr-venue-sessions/2013-05-22.cdr.csv
13/08/08 11:03:12 WARN conf.Configuration: dfs.df.interval is deprecated. Instead, use fs.df.interval
13/08/08 11:03:12 WARN conf.Configuration: dfs.max.objects is deprecated. Instead, use dfs.namenode.max.objects
13/08/08 11:03:12 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
13/08/08 11:03:12 WARN conf.Configuration: dfs.data.dir is deprecated. Instead, use dfs.datanode.data.dir
13/08/08 11:03:12 WARN conf.Configuration: dfs.name.dir is deprecated. Instead, use dfs.namenode.name.dir
13/08/08 11:03:12 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
13/08/08 11:03:12 WARN conf.Configuration: fs.checkpoint.dir is deprecated. Instead, use dfs.namenode.checkpoint.dir
13/08/08 11:03:12 WARN conf.Configuration: dfs.block.size is deprecated. Instead, use dfs.blocksize
13/08/08 11:03:12 WARN conf.Configuration: dfs.access.time.precision is deprecated. Instead, use dfs.namenode.accesstime.precision
13/08/08 11:03:12 WARN conf.Configuration: dfs.replication.min is deprecated. Instead, use dfs.namenode.replication.min
13/08/08 11:03:12 WARN conf.Configuration: dfs.name.edits.dir is deprecated. Instead, use dfs.namenode.edits.dir
13/08/08 11:03:12 WARN conf.Configuration: dfs.replication.considerLoad is deprecated. Instead, use dfs.namenode.replication.considerLoad
13/08/08 11:03:12 WARN conf.Configuration: dfs.balance.bandwidthPerSec is deprecated. Instead, use dfs.datanode.balance.bandwidthPerSec
13/08/08 11:03:12 WARN conf.Configuration: dfs.safemode.threshold.pct is deprecated. Instead, use dfs.namenode.safemode.threshold-pct
13/08/08 11:03:12 WARN conf.Configuration: dfs.http.address is deprecated. Instead, use dfs.namenode.http-address
13/08/08 11:03:12 WARN conf.Configuration: dfs.name.dir.restore is deprecated. Instead, use dfs.namenode.name.dir.restore
13/08/08 11:03:12 WARN conf.Configuration: dfs.https.client.keystore.resource is deprecated. Instead, use dfs.client.https.keystore.resource
13/08/08 11:03:12 WARN conf.Configuration: dfs.backup.address is deprecated. Instead, use dfs.namenode.backup.address
13/08/08 11:03:12 WARN conf.Configuration: dfs.backup.http.address is deprecated. Instead, use dfs.namenode.backup.http-address
13/08/08 11:03:12 WARN conf.Configuration: dfs.permissions is deprecated. Instead, use dfs.permissions.enabled
13/08/08 11:03:12 WARN conf.Configuration: dfs.safemode.extension is deprecated. Instead, use dfs.namenode.safemode.extension
13/08/08 11:03:12 WARN conf.Configuration: dfs.datanode.max.xcievers is deprecated. Instead, use dfs.datanode.max.transfer.threads
13/08/08 11:03:12 WARN conf.Configuration: dfs.https.need.client.auth is deprecated. Instead, use dfs.client.https.need-auth
13/08/08 11:03:12 WARN conf.Configuration: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
13/08/08 11:03:12 WARN conf.Configuration: dfs.replication.interval is deprecated. Instead, use dfs.namenode.replication.interval
13/08/08 11:03:12 WARN conf.Configuration: fs.checkpoint.edits.dir is deprecated. Instead, use dfs.namenode.checkpoint.edits.dir
13/08/08 11:03:12 WARN conf.Configuration: dfs.write.packet.size is deprecated. Instead, use dfs.client-write-packet-size
13/08/08 11:03:12 WARN conf.Configuration: dfs.permissions.supergroup is deprecated. Instead, use dfs.permissions.superusergroup
13/08/08 11:03:12 WARN conf.Configuration: topology.script.number.args is deprecated. Instead, use net.topology.script.number.args
13/08/08 11:03:12 WARN conf.Configuration: dfs.umaskmode is deprecated. Instead, use fs.permissions.umask-mode
13/08/08 11:03:12 WARN conf.Configuration: dfs.secondary.http.address is deprecated. Instead, use dfs.namenode.secondary.http-address
13/08/08 11:03:12 WARN conf.Configuration: fs.checkpoint.period is deprecated. Instead, use dfs.namenode.checkpoint.period
13/08/08 11:03:12 WARN conf.Configuration: topology.node.switch.mapping.impl is deprecated. Instead, use net.topology.node.switch.mapping.impl
13/08/08 11:03:12 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
Exception in thread "main" java.io.IOException: java.lang.RuntimeException: java.io.IOException: Permission denied
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:598)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:549)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:513)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableReducerJob(TableMapReduceUtil.java:456)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableReducerJob(TableMapReduceUtil.java:393)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableReducerJob(TableMapReduceUtil.java:363)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableReducerJob(TableMapReduceUtil.java:346)
at net.thecloud.bi.cdr.jobs.LocalCsvCdrHbaseJob.main(LocalCsvCdrHbaseJob.java:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.RuntimeException: java.io.IOException: Permission denied
at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:595)
... 12 more
Caused by: java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1704)
at java.io.File.createTempFile(File.java:1792)
at org.apache.hadoop.util.JarFinder.getJar(JarFinder.java:156)
... 17 more
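The bottom of the trace shows File.createTempFile inside JarFinder.getJar hitting "Permission denied", so the failure is on the local filesystem of the submitting host, not on HDFS. A small diagnostic like the sketch below, run as the same mapred user, shows which candidate temporary locations are actually writable. The target/test-dir candidate is an assumption based on how JarFinder resolved its temp directory in Hadoop 2.0-era sources (a path relative to the current working directory, which for the command above appears to be root's home).
// Hedged diagnostic: probe the likely temp-jar locations for writability
// by the user submitting the job (here run via sudo -u mapred from /root).
String[] candidates = {
    System.getProperty("java.io.tmpdir"),                  // usually /tmp
    System.getProperty("user.dir") + "/target/test-dir",   // JarFinder default (assumed)
    System.getProperty("user.dir")                         // current working directory
};
for (String dir : candidates) {
    File d = new File(dir);
    try {
        d.mkdirs();
        File probe = File.createTempFile("probe-", ".tmp", d);
        System.out.println("writable:     " + d.getAbsolutePath());
        probe.delete();
    } catch (Exception e) {
        System.out.println("NOT writable: " + d.getAbsolutePath() + " (" + e + ")");
    }
}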
File permissions
:~ # ls -l /var/hadoop/
total 12
drwxrwxrwx 2 hdfs hdfs 4096 Aug 8 09:23 datanode
drwxrwxrwx 3 mapred hadoop 4096 Aug 8 09:41 mapred
drwxrwxrwx 3 hdfs hdfs 4096 Aug 8 09:59 namenode
HDFS permissions
:~ # hdfs dfs -ls -R /
drwxrwxrwx - hdfs hadoop 0 2013-08-08 09:36 /cdr-venue-sessions
-rw-rw-rw- 3 hdfs hadoop 27014304 2013-08-08 09:36 /cdr-venue-sessions/2013-05-22.cdr.csv
drwxrwxrwx - hbase hadoop 0 2013-08-08 10:10 /hbase
drwxrwxrwx - hbase hadoop 0 2013-08-08 10:07 /hbase/.logs
drwxrwxrwx - hbase hadoop 0 2013-08-08 10:06 /hbase/.oldlogs
drwxrwxrwx - hbase hadoop 0 2013-08-08 10:10 /hbase/.tmp
-rw-rw-rw- 3 hbase hadoop 38 2013-08-08 10:06 /hbase/hbase.id
-rw-rw-rw- 3 hbase hadoop 3 2013-08-08 10:06 /hbase/hbase.version
drwxrwxrwx - hbase hadoop 0 2013-08-08 10:10 /hbase/venue_session
-rw-rw-rw- 3 hbase hadoop 711 2013-08-08 10:10 /hbase/venue_session/.tableinfo.0000000001
drwxrwxrwx - hbase hadoop 0 2013-08-08 10:10 /hbase/venue_session/.tmp
drwxrwxrwx - hbase hadoop 0 2013-08-08 10:10 /hbase/venue_session/5cd64eee2dea6b1464023f24eee3daf0
-rw-rw-rw- 3 hbase hadoop 246 2013-08-08 10:10 /hbase/venue_session/5cd64eee2dea6b1464023f24eee3daf0/.regioninfo
drwxrwxrwx - hbase hadoop 0 2013-08-08 10:10 /hbase/venue_session/5cd64eee2dea6b1464023f24eee3daf0/values
drwxrwxrwt - hdfs hadoop 0 2013-08-08 09:41 /tmp
drwxrwxrwx - mapred hadoop 0 2013-08-08 09:41 /tmp/hadoop-mapred
drwxrwxrwx - mapred hadoop 0 2013-08-08 09:41 /tmp/hadoop-mapred/mapred
drwxrwxrwx - mapred hadoop 0 2013-08-08 10:06 /tmp/hadoop-mapred/mapred/system
-rw-rw-rw- 3 mapred hadoop 4 2013-08-08 10:06 /tmp/hadoop-mapred/mapred/system/jobtracker.info
drwxrwxrwx - hdfs hadoop 0 2013-08-08 09:30 /user-venue-types
drwxrwxrwx - hdfs hadoop 0 2013-08-08 09:28 /var
drwxrwxrwx - hdfs hadoop 0 2013-08-08 09:28 /var/hadoop
drwxrwxrwx - mapred hadoop 0 2013-08-08 09:28 /var/hadoop/mapred
drwxrwxrwx - hdfs hadoop 0 2013-08-08 09:27 /var/lib
drwxrwxrwx - hdfs hadoop 0 2013-08-08 09:27 /var/lib/hadoop-hdfs
drwxrwxrwx - hdfs hadoop 0 2013-08-08 09:27 /var/lib/hadoop-hdfs/cache
drwxrwxrwx - mapred hadoop 0 2013-08-08 09:27 /var/lib/hadoop-hdfs/cache/mapred
drwxrwxrwx - mapred hadoop 0 2013-08-08 09:27 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwxrwxrwt - mapred hadoop 0 2013-08-08 09:27 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
drwxrwxrwx - hdfs hadoop 0 2013-08-08 09:30 /venues
Answer (score: 1)
Permissions in Hadoop are usually not that simple. A few pointers for debugging:
These questions may be useful to you: