I am trying to read data from one Hive table and write it into another table. The source table:
CREATE TABLE `dmg_bindings`(
`viuserid` string,
`puid` string,
`ts` bigint)
PARTITIONED BY (
`dt` string,
`pid` string)
The destination table:
CREATE TABLE `newdmgbnd`(
`ts` int,
`puid1` string,
`puid2` string)
PARTITIONED BY (
`dt` string,
`platid1` string,
`platid2` string)
But I have a problem and cannot find where my mistake is. I get the following error:
15/01/15 10:22:07 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
15/01/15 10:22:07 INFO hive.metastore: Trying to connect to metastore with URI thrift://srv112.test.local:9083
15/01/15 10:22:07 INFO hive.metastore: Connected to metastore.
15/01/15 10:22:08 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
15/01/15 10:22:08 INFO columnar.ColumnarSerDe: ColumnarSerDe initialized with: columnNames=[viuserid, puid, ts] columnTypes=[string, string, bigint] separator=[[B@6d88b065] nullstring=\N
[... the same ColumnarSerDe initialization line repeats 25 more times, differing only in the separator object address ...]
15/01/15 10:22:08 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
15/01/15 10:22:09 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
15/01/15 10:22:10 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
15/01/15 10:22:10 INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
15/01/15 10:22:10 INFO mapred.FileInputFormat: Total input paths to process : 40
[... similar "Total input paths to process" lines repeat for the remaining input directories, with counts of 1, 2, 16, or 40 ...]
15/01/15 10:22:12 INFO mapred.JobClient: Running job: job_201412021320_0142
15/01/15 10:22:13 INFO mapred.JobClient: map 0% reduce 0%
15/01/15 10:22:24 INFO mapred.JobClient: Task Id : attempt_201412021320_0142_m_000002_0, Status : FAILED
java.lang.NullPointerException
at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:167)
at org.apache.hive.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:558)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:106)
at MapNewDmg.map(MapNewDmg.java:32)
at MapNewDmg.map(MapNewDmg.java:15)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(Use
attempt_201412021320_0142_m_000002_0: SLF4J: Class path contains multiple SLF4J bindings.
attempt_201412021320_0142_m_000002_0: SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201412021320_0142_m_000002_0: SLF4J: Found binding in [jar:file:/mnt1/mapred/local/taskTracker/mvolosnikova/jobcache/job_201412021320_0142/jars/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
attempt_201412021320_0142_m_000002_0: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
attempt_201412021320_0142_m_000002_0: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
My Driver class:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.schema.HCatFieldSchema;
import org.apache.hive.hcatalog.data.schema.HCatSchema;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;
import org.apache.hive.hcatalog.mapreduce.HCatOutputFormat;
import org.apache.hive.hcatalog.mapreduce.OutputJobInfo;

public class Driver extends Configured implements Tool {

    @Override
    public int run(String[] strings) throws Exception {
        Configuration conf = getConf();
        Job job = Job.getInstance(conf, "newDmg");
        HCatInputFormat.setInput(job, "default", "dmg_bindings", "dt=\"2014-09-01\"");

        job.setJarByClass(Driver.class);
        job.setMapperClass(MapNewDmg.class);
        job.setNumReduceTasks(0);
        job.setInputFormatClass(HCatInputFormat.class);
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);
        job.setOutputFormatClass(HCatOutputFormat.class);

        Map<String, String> staticPartitions = new HashMap<String, String>(1);
        staticPartitions.put("dt", "2014-09-01");
        List<String> dynamicPartitions = new ArrayList<String>();
        dynamicPartitions.add("platid1");
        dynamicPartitions.add("platid2");

        OutputJobInfo jobInfo = OutputJobInfo.create("default", "newdmgbnd", staticPartitions);
        jobInfo.setDynamicPartitioningKeys(dynamicPartitions);
        HCatOutputFormat.setOutput(job, jobInfo);

        HCatSchema schema = HCatOutputFormat.getTableSchema(job);
        schema.append(new HCatFieldSchema("platid1", HCatFieldSchema.Type.STRING, ""));
        schema.append(new HCatFieldSchema("platid2", HCatFieldSchema.Type.STRING, ""));
        HCatOutputFormat.setSchema(job, schema);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitcode = ToolRunner.run(new Driver(), args);
        System.exit(exitcode);
    }
}
My Mapper class:
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.HCatRecord;

public class MapNewDmg extends Mapper<WritableComparable, HCatRecord, WritableComparable, HCatRecord> {

    @Override
    protected void map(WritableComparable key, HCatRecord value, Context context)
            throws IOException, InterruptedException {
        String viuserid = (String) value.get(0);
        String puid = (String) value.get(1);
        Long ts = (Long) value.get(2);
        String pid = (String) value.get(4);
        int newts = (int) (ts / 1000);

        HCatRecord record = new DefaultHCatRecord(6);
        record.set(0, newts);
        record.set(1, viuserid);
        record.set(2, puid);
        record.set(4, "586");
        record.set(5, pid);
        context.write(null, record);
    }
}
What is wrong in my program? I cannot understand why this error occurs, because my data is not empty (yes, I checked). Please help me. Thanks.
Answer (score: 0)
In your mapper you call context.write(null, record); and that is the problem. If you do not want a meaningful key, use NullWritable instead: change the mapper's declaration and the driver to reflect the new key type, and replace context.write(null, record); with context.write(NullWritable.get(), record);.
As an aside (not your case, just FYI): when a reducer is involved, a NullWritable key is not the best choice; see https://support.pivotal.io/hc/en-us/articles/202810986-Mapper-output-key-value-NullWritable-can-cause-reducer-phase-to-move-slowly for details.
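A minimal sketch of the suggested change, keeping the same table schemas and record layout as in the question (only the output key type and the write call change; the record-building code is elided here):

```java
import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hive.hcatalog.data.DefaultHCatRecord;
import org.apache.hive.hcatalog.data.HCatRecord;

// The OUTPUT key type changes from WritableComparable to NullWritable;
// the input key stays WritableComparable, as delivered by HCatInputFormat.
public class MapNewDmg extends Mapper<WritableComparable, HCatRecord, NullWritable, HCatRecord> {

    @Override
    protected void map(WritableComparable key, HCatRecord value, Context context)
            throws IOException, InterruptedException {
        HCatRecord record = new DefaultHCatRecord(6);
        // ... populate the record exactly as in the original mapper ...

        // Write a real (singleton, empty) key object instead of a Java null.
        context.write(NullWritable.get(), record);
    }
}
```

The driver then needs the matching declaration: job.setOutputKeyClass(NullWritable.class); instead of job.setOutputKeyClass(WritableComparable.class);. NullWritable.get() returns a shared singleton that serializes to zero bytes, so downstream code that touches the key (as the HCatalog record writer does) gets a real object rather than a null reference.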