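I'm new to HDP. I want to create an HBase table with multiple columns and load data into it from a CSV file, as shown below.
As you can see, each column family in my example, such as "informations personnelles", contains several columns, for example "nom", "prenom", and so on.
So my questions are:
- How do I create an HBase table on the HDP sandbox using the Java API?
- How do I load the data from my CSV file?
PS: I tried to create the table, but I don't know how to run the program on the sandbox. Where do I put my Java class? Do I need to configure anything?
Here is my code: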
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateTable {
    public static void main(String[] args) throws IOException {
        // Instantiating configuration class
        Configuration con = HBaseConfiguration.create();
        con.set("hbase.zookeeper.property.clientPort", "2181");
        con.set("hbase.zookeeper.quorum", "hortonworks.hbase.vm");
        con.set("zookeeper.znode.parent", "/hbase-unsecure");
        System.out.println("HBase is running!");
        // Instantiating HBaseAdmin class
        HBaseAdmin admin = new HBaseAdmin(con);
        // Instantiating table descriptor class
        HTableDescriptor tableDescriptor =
            new HTableDescriptor(TableName.valueOf("competence"));
        // Adding column families to table descriptor
        tableDescriptor.addFamily(new HColumnDescriptor("Infos_collaborateur"));
        tableDescriptor.addFamily(new HColumnDescriptor("Infos_Rh"));
        tableDescriptor.addFamily(new HColumnDescriptor("Savoir_faire"));
        tableDescriptor.addFamily(new HColumnDescriptor("Savoir_etre"));
        tableDescriptor.addFamily(new HColumnDescriptor("Langues"));
        tableDescriptor.addFamily(new HColumnDescriptor("Java:Developpement/Librairies/API/Frameworks/CMS"));
        tableDescriptor.addFamily(new HColumnDescriptor("PHP/Frameworks"));
        tableDescriptor.addFamily(new HColumnDescriptor("Techno_Web/Frameworks"));
        tableDescriptor.addFamily(new HColumnDescriptor("Autres"));
        tableDescriptor.addFamily(new HColumnDescriptor("ERP:Language/Outils"));
        tableDescriptor.addFamily(new HColumnDescriptor("Mobile:natif"));
        tableDescriptor.addFamily(new HColumnDescriptor("Mobile:Cross"));
        tableDescriptor.addFamily(new HColumnDescriptor("Infographie/creas"));
        tableDescriptor.addFamily(new HColumnDescriptor("Outils_de_developpement/Software"));
        tableDescriptor.addFamily(new HColumnDescriptor("Analytics"));
        tableDescriptor.addFamily(new HColumnDescriptor("Outils_Microsoft"));
        tableDescriptor.addFamily(new HColumnDescriptor("Developpements/Librairies"));
        tableDescriptor.addFamily(new HColumnDescriptor("BaseDeDonnees/FluxDeDonnees"));
        tableDescriptor.addFamily(new HColumnDescriptor("Windows:SystemeDexploitation/serveur"));
        tableDescriptor.addFamily(new HColumnDescriptor("AutresOS"));
        tableDescriptor.addFamily(new HColumnDescriptor("Plateforms"));
        tableDescriptor.addFamily(new HColumnDescriptor("Serveur_web_parametrage"));
        tableDescriptor.addFamily(new HColumnDescriptor("Serveur_Application_parametrage"));
        tableDescriptor.addFamily(new HColumnDescriptor("Integration/fonctionnel"));
        tableDescriptor.addFamily(new HColumnDescriptor("Outils_de_conception/de_gestion_projet"));
        tableDescriptor.addFamily(new HColumnDescriptor("AMOA"));
        tableDescriptor.addFamily(new HColumnDescriptor("Experience"));
        tableDescriptor.addFamily(new HColumnDescriptor("Interventions"));
        // Create the table through admin
        admin.createTable(tableDescriptor);
        System.out.println(" Table created ");
    }
}
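Thanks in advance.
Answer 0 (score: 0)
If you are running the Java program from your local machine and connecting to HBase and ZooKeeper on the sandbox, you need to forward port 2181 in the sandbox VM under Settings > Network > Advanced > Port Forwarding. Give the rule any name, such as zk, with Protocol: TCP, Host IP: 127.0.0.1, Host Port: 2181, Guest Port: 2181. Then set the configuration in your program as follows and run it: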
con.set("hbase.zookeeper.property.clientPort", "2181");
con.set("hbase.zookeeper.quorum", "127.0.0.1");
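Before running the HBase client, it can help to confirm that the forwarding rule is actually in place. The following is a minimal sketch using only the JDK; the class name PortCheck and the method isReachable are hypothetical names chosen for illustration, and the 2-second timeout is an arbitrary assumption. A successful connection only shows that something accepts TCP connections on the host port, not that ZooKeeper itself is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // 127.0.0.1:2181 is the forwarded ZooKeeper port described in this answer.
        boolean ok = isReachable("127.0.0.1", 2181, 2000);
        System.out.println(ok
            ? "Port 2181 accepts connections on 127.0.0.1"
            : "Cannot reach 127.0.0.1:2181; check the port-forwarding rule");
    }
}
```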
In your Java program you can read the CSV file with the Scanner API (for reference, see http://www.journaldev.com/2335/read-csv-file-java-scanner) and store the data with the Java HBase API (for reference, see https://autofei.wordpress.com/2012/04/02/java-example-code-using-hbase-data-model-operations/).
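As a rough illustration of the CSV side, here is a minimal sketch that uses Scanner to turn CSV rows into (row key, family, qualifier, value) cells. The CSV layout is an assumption, since the actual file is not shown: the first column is taken as the row key and each remaining header is written as family:qualifier. The class name CsvToCells, the nested Cell type, and the sample data are hypothetical. The plain split on commas does not handle quoted fields. The HBase write is left as a comment because it needs the HBase client library and a running cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class CsvToCells {
    /** One HBase cell: row key, column family, qualifier, value. */
    public static final class Cell {
        public final String rowKey, family, qualifier, value;

        public Cell(String rowKey, String family, String qualifier, String value) {
            this.rowKey = rowKey;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /** Parses CSV text whose header cells (after the first) look like family:qualifier. */
    public static List<Cell> parse(Scanner in) {
        List<Cell> cells = new ArrayList<>();
        if (!in.hasNextLine()) {
            return cells;
        }
        String[] header = in.nextLine().split(",", -1);
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",", -1);
            String rowKey = fields[0].trim(); // assumption: first column is the row key
            for (int i = 1; i < header.length && i < fields.length; i++) {
                String value = fields[i].trim();
                if (value.isEmpty()) {
                    continue; // HBase tables are sparse, so empty cells are simply not stored
                }
                String[] col = header[i].trim().split(":", 2);
                if (col.length != 2) {
                    throw new IllegalArgumentException(
                        "Header must be family:qualifier, got: " + header[i]);
                }
                cells.add(new Cell(rowKey, col[0], col[1], value));
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; replace with new Scanner(new java.io.File(path)).
        String csv = "id,Infos_collaborateur:nom,Infos_collaborateur:prenom,Langues:anglais\n"
                   + "1,Dupont,Marie,courant\n"
                   + "2,Martin,,debutant\n";
        for (Cell c : parse(new Scanner(csv))) {
            System.out.println(c.rowKey + " " + c.family + ":" + c.qualifier + " = " + c.value);
            // With the HBase 1.x client and an open Table, each cell would become roughly:
            //   Put put = new Put(Bytes.toBytes(c.rowKey));
            //   put.addColumn(Bytes.toBytes(c.family), Bytes.toBytes(c.qualifier),
            //                 Bytes.toBytes(c.value));
            //   table.put(put);
        }
    }
}
```

Because HBase uses the colon to separate family from qualifier, this layout only works when the family names themselves contain no colon.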
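Another option is to copy your file and your Java program's jar to the sandbox and run it there. To copy files or SSH to the sandbox, you need a port-forwarding rule like the one above, with Host Port: 2222 and Guest Port: 22.
Hope this helps.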