I'm trying to save a jsonRDD into HBase using the Apache Phoenix Spark plugin: df.saveToPhoenix(tableName, zkUrl = Some(quorumAddress)) (a fuller sketch of the call follows the DDL below).
The table looks like this:
CREATE TABLE IF NOT EXISTS person (
ID BIGINT NOT NULL PRIMARY KEY,
NAME VARCHAR,
SURNAME VARCHAR) SALT_BUCKETS = 40, COMPRESSION='GZ';
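
For completeness, this is roughly how I invoke the save (the JSON path and ZooKeeper quorum here are placeholders, not my real values):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.phoenix.spark._  // brings saveToPhoenix into scope on DataFrame

val sc = new SparkContext(new SparkConf().setAppName("phoenix-save"))
val sqlContext = new SQLContext(sc)

// DataFrame columns must line up with the Phoenix table: ID, NAME, SURNAME
val df = sqlContext.read.json("hdfs:///data/person.json")

df.saveToPhoenix("PERSON", zkUrl = Some("hadoop-01:2181"))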
I have roughly 100,000 to 2,000,000 records in tables like this. Some of them save fine, but others fail with this error:
java.lang.RuntimeException: org.apache.phoenix.exception.PhoenixIOException:
callTimeout=1200000, callDuration=2902566: row 'PERSON' on table 'SYSTEM.CATALOG' at
region=SYSTEM.CATALOG,,1443172839381.a593d4dbac97863f897bca469e8bac66.,
hostname=hadoop-02,16020,1443292360474, seqNum=339
at org.apache.phoenix.mapreduce.PhoenixRecordWriter.close(PhoenixRecordWriter.java:62)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1$$anonfun$12$$anonfun$apply$5.apply$mcV$sp(PairRDDFunctions.scala:1043)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1294)
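
In case it matters, saveToPhoenix also accepts a Hadoop Configuration as its second parameter, so I assume the client timeouts could be raised along these lines (the property values below are just an illustration, not something I have verified):

import org.apache.hadoop.conf.Configuration
import org.apache.phoenix.spark._

val conf = new Configuration()
// raise the HBase client RPC timeout and the Phoenix query timeout (values are guesses)
conf.set("hbase.rpc.timeout", "600000")
conf.set("phoenix.query.timeoutMs", "600000")

df.saveToPhoenix("PERSON", conf = conf, zkUrl = Some("hadoop-01:2181"))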
What does this error mean? And is there another way to bulk insert data from a DataFrame into HBase?