How do I create an HBase table with multiple column families on the Hortonworks HDP?

Time: 2016-09-07 10:04:00

Tags: hbase hortonworks-data-platform hortonworks-sandbox

I am new to HDP. I want to create an HBase table with multiple column families and load data into it from a CSV file like the one shown below.

csv file

As you can see, each column family in my example, such as "informations personnelles", contains several columns, for example "nom", "prenom" and so on.

So my questions are:
- How do I create an HBase table on the HDP sandbox using the Java API?
- How do I load the data from my CSV file?

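PS: I have tried to write the code that creates the table, but I don't know how to run it on the sandbox. Where do I put my Java class? Do I need to configure anything?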

Here is my code:

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.TableName;

import org.apache.hadoop.conf.Configuration;

public class CreateTable {

   public static void main(String[] args) throws IOException {

      // Instantiating configuration class
      Configuration con = HBaseConfiguration.create();
      con.set("hbase.zookeeper.property.clientPort", "2181");
      con.set("hbase.zookeeper.quorum", "hortonworks.hbase.vm");
      con.set("zookeeper.znode.parent", "/hbase-unsecure");
      System.out.println("HBase is running!");

      // Instantiating HBaseAdmin class
      HBaseAdmin admin = new HBaseAdmin(con);

      // Instantiating table descriptor class
      HTableDescriptor tableDescriptor =
            new HTableDescriptor(TableName.valueOf("competence"));

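      // Adding column families to table descriptor.
      // Note: HBase rejects family names that contain ':' or '/', so the
      // names below that use those characters must be renamed before this runs.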
      tableDescriptor.addFamily(new HColumnDescriptor("Infos_collaborateur"));
      tableDescriptor.addFamily(new HColumnDescriptor("Infos_Rh"));
      tableDescriptor.addFamily(new HColumnDescriptor("Savoir_faire"));
      tableDescriptor.addFamily(new HColumnDescriptor("Savoir_etre"));
      tableDescriptor.addFamily(new HColumnDescriptor("Langues"));
      tableDescriptor.addFamily(new HColumnDescriptor("Java:Developpement/Librairies/API/Frameworks/CMS"));
      tableDescriptor.addFamily(new HColumnDescriptor("PHP/Frameworks"));
      tableDescriptor.addFamily(new HColumnDescriptor("Techno_Web/Frameworks"));
      tableDescriptor.addFamily(new HColumnDescriptor("Autres"));
      tableDescriptor.addFamily(new HColumnDescriptor("ERP:Language/Outils"));
      tableDescriptor.addFamily(new HColumnDescriptor("Mobile:natif"));
      tableDescriptor.addFamily(new HColumnDescriptor("Mobile:Cross"));
      tableDescriptor.addFamily(new HColumnDescriptor("Infographie/creas"));
      tableDescriptor.addFamily(new HColumnDescriptor("Outils_de_developpement/Software"));
      tableDescriptor.addFamily(new HColumnDescriptor("Analytics"));
      tableDescriptor.addFamily(new HColumnDescriptor("Outils_Microsoft"));
      tableDescriptor.addFamily(new HColumnDescriptor("Developpements/Librairies"));
      tableDescriptor.addFamily(new HColumnDescriptor("BaseDeDonnees/FluxDeDonnees"));
      tableDescriptor.addFamily(new HColumnDescriptor("Windows:SystemeDexploitation/serveur"));
      tableDescriptor.addFamily(new HColumnDescriptor("AutresOS"));
      tableDescriptor.addFamily(new HColumnDescriptor("Plateforms"));
      tableDescriptor.addFamily(new HColumnDescriptor("Serveur_web_parametrage"));
      tableDescriptor.addFamily(new HColumnDescriptor("Serveur_Application_parametrage"));
      tableDescriptor.addFamily(new HColumnDescriptor("Integration/fonctionnel"));
      tableDescriptor.addFamily(new HColumnDescriptor("Outils_de_conception/de_gestion_projet"));
      tableDescriptor.addFamily(new HColumnDescriptor("AMOA"));
      tableDescriptor.addFamily(new HColumnDescriptor("Experience"));   
      tableDescriptor.addFamily(new HColumnDescriptor("Interventions"));            

      // Create the table through admin, then release the connection
      admin.createTable(tableDescriptor);
      System.out.println(" Table created ");
      admin.close();
   }
}

Thanks in advance.

1 answer:

Answer 0: (score: 0)

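If you are trying to run the Java program from your local machine and connect to HBase and ZooKeeper on the sandbox, you need to forward port 2181 in the sandbox VM settings: Settings > Network > Advanced > Port Forwarding. Add a rule with any name (for example zk), Protocol: TCP, Host IP: 127.0.0.1, Host Port: 2181, Guest Port: 2181. Then set the configuration in your program as follows and run it: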

con.set("hbase.zookeeper.property.clientPort", "2181");
con.set("hbase.zookeeper.quorum", "127.0.0.1");
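Before running the HBase client, it can help to confirm that the forwarded port accepts connections at all. The following is a minimal sketch using only the Java standard library. The class name `PortCheck` and the 2-second timeout are illustrative assumptions, not part of the original answer. A successful connect only shows that something is listening on the host port, not that HBase itself is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // 127.0.0.1:2181 is the forwarded ZooKeeper port from the answer above.
        boolean ok = isReachable("127.0.0.1", 2181, 2000);
        System.out.println("ZooKeeper port 2181 reachable: " + ok);
    }
}
```

If this reports false, the port forwarding rule is probably the first thing to recheck before looking at the HBase configuration.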

In your Java program you can read the CSV file with the Scanner API (for reference, see http://www.journaldev.com/2335/read-csv-file-java-scanner) and store the data with the HBase Java API (for reference, see https://autofei.wordpress.com/2012/04/02/java-example-code-using-hbase-data-model-operations/).
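As a rough illustration of the CSV side of this, the sketch below turns one CSV line into a list of (row key, family, qualifier, value) cells using only the standard library. It rests on several assumptions that the question does not confirm: the first CSV column is the row key, the remaining header cells are written as `family:qualifier`, fields are comma-separated, and no field contains a quoted comma. The class `CsvToCells` and the sample values are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps a CSV line to HBase-style cells.
public class CsvToCells {

    /** One cell to write: row key, column family, qualifier, value. */
    public static final class Cell {
        public final String rowKey;
        public final String family;
        public final String qualifier;
        public final String value;

        Cell(String rowKey, String family, String qualifier, String value) {
            this.rowKey = rowKey;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /**
     * Assumes header cells look like "family:qualifier" and that the first
     * column holds the row key. Empty values are skipped, since HBase does
     * not need a cell for a missing value.
     */
    public static List<Cell> toCells(String headerLine, String dataLine) {
        String[] header = headerLine.split(",", -1);
        String[] fields = dataLine.split(",", -1);
        String rowKey = fields[0].trim();
        List<Cell> cells = new ArrayList<>();
        for (int i = 1; i < header.length && i < fields.length; i++) {
            String value = fields[i].trim();
            if (value.isEmpty()) {
                continue;
            }
            String[] parts = header[i].trim().split(":", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException(
                        "Header is not family:qualifier: " + header[i]);
            }
            cells.add(new Cell(rowKey, parts[0], parts[1], value));
        }
        return cells;
    }

    public static void main(String[] args) {
        // Made-up sample data; the real CSV layout is not shown in the question.
        String header = "id,Infos_collaborateur:nom,Infos_collaborateur:prenom,Langues:anglais";
        String row = "1,Dupont,Jean,";
        for (Cell c : toCells(header, row)) {
            System.out.println(c.rowKey + " " + c.family + ":" + c.qualifier + " = " + c.value);
        }
    }
}
```

With the HBase 1.x client, each such cell would typically be added to a `Put` for its row key via `put.addColumn(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value))` and then written with `table.put(put)`. That part is left out here so the sketch stays runnable without the HBase jars.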

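Another option is to copy your file and the Java program's jar to the sandbox and run them there. To scp or ssh into the sandbox you need port forwarding as described above, with Host Port: 2222 and Guest Port: 22.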

Hope this helps.