Hadoop-Vertica Connector: Can I write to Vertica from the map phase?

Time: 2014-10-28 16:39:06

Tags: hadoop vertica mapr

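I am using the Hadoop-Vertica Connector to import large files into Vertica, and I am trying to do it with a Hadoop job that has no Reducer. During the map phase, however, the Vertica output table never seems to be initialized, and the job always fails with an error.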

The documentation I checked does not say whether writing to Vertica during the map phase is supported, so I would like to know: is this possible?

Thanks!

Edit

This is the documentation for the Hadoop Vertica Connector.

Error:

java.io.IOException: Cannot set record by name if names not initialized
at com.vertica.hadoop.VerticaRecord.set(VerticaRecord.java:270)
at com.vertica.hadoop.VerticaWordCount$TokenizerMapper.map(VerticaWordCount.java:92)
at com.vertica.hadoop.VerticaWordCount$TokenizerMapper.map(VerticaWordCount.java:60)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doA

Looking at the source of VerticaWordCount.java, I found that the output table's list of column names is never initialized.
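To make the failure mode concrete, here is a minimal sketch of how a record type can reject set-by-name while its column-name list is still empty. The class `NamedRecordSketch` and all of its methods are hypothetical and written only for illustration; this is not the connector's `VerticaRecord`, and whether the real class behaves exactly this way is an assumption based on the error message in the stack trace.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; NOT the connector's VerticaRecord.
class NamedRecordSketch {
    // Stays empty until initNames() is called.
    private final List<String> names = new ArrayList<>();
    private final List<Object> values = new ArrayList<>();

    NamedRecordSketch(int columnCount) {
        // Reserve one slot per column so positional access works immediately.
        for (int i = 0; i < columnCount; i++) {
            values.add(null);
        }
    }

    // Loads the column names, e.g. from table metadata in a real system.
    void initNames(List<String> columnNames) {
        names.clear();
        names.addAll(columnNames);
    }

    // Setting by position does not need the name list.
    void set(int index, Object value) {
        values.set(index, value);
    }

    // Setting by name fails while the name list is empty.
    void set(String name, Object value) throws IOException {
        if (names.isEmpty()) {
            throw new IOException("Cannot set record by name if names not initialized");
        }
        int index = names.indexOf(name);
        if (index < 0) {
            throw new IOException("Unknown column: " + name);
        }
        set(index, value);
    }

    Object get(int index) {
        return values.get(index);
    }

    // Convenience wrapper: returns false instead of throwing.
    boolean trySetByName(String name, Object value) {
        try {
            set(name, value);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        NamedRecordSketch record = new NamedRecordSketch(3);
        System.out.println(record.trySetByName("a", 1));   // prints false: names are empty
        record.set(0, 1);                                   // positional set still works
        record.initNames(Arrays.asList("a", "b", "c"));
        System.out.println(record.trySetByName("b", "x"));  // prints true: names are loaded
    }
}
```

In this sketch the by-name path depends on a separate initialization step, while the positional path does not. That would be consistent with the exception above if the record used in the mapper is created before the output table's column names are available.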

Here is my configuration in run():

  Job job = new Job(conf, "vertica hadoop");
  conf = job.getConfiguration();
  conf.set("mapreduce.job.tracker", "local");

  //job.setInputFormatClass(VerticaInputFormat.class);
  //You have to set the MapOutputKeyClass and MapOutputValueClass, 
  //since by default it will be the same as the class of Reducer's
  //Output Key and Value
  job.setMapOutputKeyClass(Text.class);
  job.setMapOutputValueClass(VerticaRecord.class);

  /*************Settings for Vertica output************************/
  //Set the output format of Reduce class. 
  //I will output VerticaRecords that will be stored in the database
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(VerticaRecord.class);

  //Tell Hadoop to send its output to the Vertica
  job.setOutputFormatClass(VerticaOutputFormat.class);
  /****************************************************************/

  job.setJarByClass(VerticaWordCount.class);
  job.setMapperClass(TokenizerMapper.class);
  FileInputFormat.addInputPath(job, new Path("/user/tmp/input"));

  /******************************************************************/
  //Defining the output table
  //VerticaOutputFormat.setOutput(jobObject, tableName, [truncate, ["columnName1 dataType1" [, "columnNameN dataTypeN" ...]]]);
  VerticaOutputFormat.setOutput(job, "target", true, "a int", "b varchar", "c varchar");

0 answers:

No answers