Error importing data from a CSV into SnappyData from Java

Time: 2018-08-06 10:14:38

Tags: java scala snappydata

My table schema in Scala is

snSession.sql("create table category_subscriber(id int,catId int,brandId int,domains int,osId int,rType int,rTime int,ctId int,icmpId int,setId int,rAt int,cyId int) using column options(BUCKETS '5',PARTITION_BY 'ID',OVERFLOW 'true',EVICTION_BY 'LRUHEAPPERCENT')");

My code in Java is

Statement statement = snappy.createStatement();
            statement.execute("CREATE EXTERNAL TABLE CATEGORY_SUBSCRIBER USING com.databricks.spark.csv OPTIONS(path '/home/sys1010/Desktop/category_sub.csv', header 'true', inferSchema 'true', nullValue 'NULL', maxCharsPerColumn '4096';");

I get the following error when importing the data from the CSV into SnappyData through Java

INFO: Starting client on '172.16.20.28' with ID='1965|2018/08/06 15:38:58.573 IST' Source-Revision=e6cfbfdb0f14ee87261381934075b7f37672a99d
Aug 06, 2018 3:38:59 PM snappydump.SnappyOps upsert
SEVERE: null
java.sql.SQLException: (SQLState=42X01 Severity=20000) (Server=172.16.20.28/172.16.20.28[1528] Thread=ThriftProcessor-3) Syntax error: org.apache.spark.sql.ParseException: Invalid input 'U', expected tableSchema or 'EOI' (line 1, column 1):
USING com.databricks.spark.csv OPTIONS(path '/home/sys1010/Desktop/category_sub.csv', header 'true', inferSchema 'true', nullValue 'NULL', maxCharsPerColumn '4096'
^;;.
    at io.snappydata.thrift.SnappyDataService$execute_result$execute_resultStandardScheme.read(SnappyDataService.java:7033)
    at io.snappydata.thrift.SnappyDataService$execute_result$execute_resultStandardScheme.read(SnappyDataService.java:7010)
    at io.snappydata.thrift.SnappyDataService$execute_result.read(SnappyDataService.java:6949)
    at io.snappydata.org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at io.snappydata.thrift.SnappyDataService$Client.recv_execute(SnappyDataService.java:256)
    at io.snappydata.thrift.SnappyDataService$Client.execute(SnappyDataService.java:239)
    at io.snappydata.thrift.internal.ClientService.execute(ClientService.java:889)
    at io.snappydata.thrift.internal.ClientStatement.execute(ClientStatement.java:720)
    at io.snappydata.thrift.internal.ClientStatement.execute(ClientStatement.java:371)
    at snappydump.SnappyOps.upsert(SnappyOps.java:29)
    at snappydump.SnappyDump.menu(SnappyDump.java:51)
    at snappydump.SnappyDump.main(SnappyDump.java:39)
Caused by: java.rmi.ServerException: Server STACK: java.sql.SQLSyntaxErrorException(42X01): Syntax error: org.apache.spark.sql.ParseException: Invalid input 'U', expected tableSchema or 'EOI' (line 1, column 1):
USING com.databricks.spark.csv OPTIONS(path '/home/sys1010/Desktop/category_sub.csv', header 'true', inferSchema 'true', nullValue 'NULL', maxCharsPerColumn '4096'
^;;.
    at com.pivotal.gemfirexd.internal.iapi.error.StandardException.newException(StandardException.java:214)
    at com.pivotal.gemfirexd.internal.engine.Misc.processFunctionException(Misc.java:776)
    at com.pivotal.gemfirexd.internal.engine.Misc.processFunctionException(Misc.java:757)
    at com.pivotal.gemfirexd.internal.engine.sql.execute.SnappySelectResultSet.setup(SnappySelectResultSet.java:284)
    at com.pivotal.gemfirexd.internal.engine.distributed.message.GfxdFunctionMessage.executeFunction(GfxdFunctionMessage.java:332)
    at com.pivotal.gemfirexd.internal.engine.distributed.message.GfxdFunctionMessage.executeFunction(GfxdFunctionMessage.java:274)
    at com.pivotal.gemfirexd.internal.engine.sql.execute.SnappyActivation.executeOnLeadNode(SnappyActivation.java:338)
    at com.pivotal.gemfirexd.internal.engine.sql.execute.SnappyActivation.executeWithResultSet(SnappyActivation.java:202)
    at com.pivotal.gemfirexd.internal.engine.sql.execute.SnappyActivation.execute(SnappyActivation.java:158)
    at com.pivotal.gemfirexd.internal.impl.sql.GenericActivationHolder.execute(GenericActivationHolder.java:462)
    at com.pivotal.gemfirexd.internal.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:586)
    at com.pivotal.gemfirexd.internal.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:2175)
    at com.pivotal.gemfirexd.internal.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:1289)
    at com.pivotal.gemfirexd.internal.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:1006)
    at com.pivotal.gemfirexd.internal.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:972)
    at io.snappydata.thrift.server.SnappyDataServiceImpl.execute(SnappyDataServiceImpl.java:1704)
    at io.snappydata.thrift.SnappyDataService$Processor$execute.getResult(SnappyDataService.java:1511)
    at io.snappydata.thrift.SnappyDataService$Processor$execute.getResult(SnappyDataService.java:1495)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at io.snappydata.thrift.server.SnappyDataServiceImpl$Processor.process(SnappyDataServiceImpl.java:201)
    at io.snappydata.thrift.server.SnappyThriftServerThreadPool$WorkerProcess.run(SnappyThriftServerThreadPool.java:270)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at io.snappydata.thrift.server.SnappyThriftServer$1.lambda$newThread$0(SnappyThriftServer.java:143)
    at java.lang.Thread.run(Thread.java:748)

The data in the CSV is tab-separated, like this

59314315    22  0   50  0   4   1531506600  0   87152   0   1531582029  0   2018-07-31
53865527    22  0   50  0   4   1531506600  0   87152   0   1531582037  0   2018-07-31
42637344    22  0   50  0   4   1531506600  0   87122   0   1531582142  0   2018-07-31
20501400    22  0   50  0   4   1531506600  0   87122   0   1531582263  0   2018-07-31
17067216    22  0   50  0   4   1531506600  0   87122   0   1531582291  0   2018-07-31
70845365    22  0   50  0   4   1531506600  0   86362   0   1531582308  0   2018-07-31
83702601    22  0   50  0   4   1531506600  0   87122   0   1531582373  0   2018-07-31

Can anyone help me?

2 Answers:

Answer 0: (Score: 0)

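There is a syntax error in the statement: the OPTIONS( list is never closed, and the SQL string ends with ';' where the closing ')' should be. The corrected statement is: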

Statement statement = snappy.createStatement();
            statement.execute("CREATE EXTERNAL TABLE CATEGORY_SUBSCRIBER USING com.databricks.spark.csv OPTIONS(path '/home/sys1010/Desktop/category_sub.csv', header 'true', inferSchema 'true', nullValue 'NULL', maxCharsPerColumn '4096')");
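
A mistake like this can be caught before the statement ever reaches the server. The following is only a minimal sketch, not part of SnappyData or JDBC: `DdlCheck` and `isBalanced` are hypothetical names, and the check only looks at parentheses and single-quoted literals, so it is no substitute for the real SQL parser.

```java
class DdlCheck {

    // Returns true when every '(' outside a single-quoted literal has a
    // matching ')' and every single-quoted literal is closed.
    static boolean isBalanced(String sql) {
        int depth = 0;
        boolean inQuote = false;
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '\'') {
                inQuote = !inQuote;          // entering or leaving a literal
            } else if (!inQuote) {
                if (c == '(') {
                    depth++;
                } else if (c == ')') {
                    depth--;
                    if (depth < 0) {
                        return false;        // ')' with no matching '('
                    }
                }
            }
        }
        return depth == 0 && !inQuote;
    }

    public static void main(String[] args) {
        String head = "CREATE EXTERNAL TABLE CATEGORY_SUBSCRIBER "
                + "USING com.databricks.spark.csv OPTIONS(";
        String opts = "path '/home/sys1010/Desktop/category_sub.csv', header 'true', "
                + "inferSchema 'true', nullValue 'NULL', maxCharsPerColumn '4096'";

        String broken = head + opts + ";";   // the statement from the question
        String fixed = head + opts + ")";    // the corrected statement

        System.out.println(isBalanced(broken)); // false
        System.out.println(isBalanced(fixed));  // true
    }
}
```

Calling `isBalanced` on the SQL string right before `statement.execute(...)` would have flagged the unclosed OPTIONS( list locally instead of surfacing as a server-side ParseException.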

Answer 1: (Score: 0)

CREATE EXTERNAL TABLE <name> USING CSV OPTIONS(....) should also work. CSV is now a built-in data source.
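
One way to keep the OPTIONS list well-formed is to assemble the statement from a map rather than typing it by hand. The sketch below is illustrative only: `ExternalTableDdl` and `build` are hypothetical names, the option keys are the ones from the question, and the provider name `CSV` is taken from this answer. The question says the file is tab-separated, so a delimiter option is probably needed as well; its exact name and how to write a tab in it depend on the data source version, so it is left out here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class ExternalTableDdl {

    // Hypothetical helper that assembles
    //   CREATE EXTERNAL TABLE <table> USING <provider> OPTIONS(key 'value', ...)
    // The StringJoiner always emits the closing ')'.
    static String build(String table, String provider, Map<String, String> options) {
        StringJoiner opts = new StringJoiner(", ", "OPTIONS(", ")");
        for (Map.Entry<String, String> e : options.entrySet()) {
            // Reject values that would break out of the quoted literal;
            // escaping rules are parser-specific, so none are assumed here.
            if (e.getValue().indexOf('\'') >= 0) {
                throw new IllegalArgumentException("quote in option value: " + e.getKey());
            }
            opts.add(e.getKey() + " '" + e.getValue() + "'");
        }
        return "CREATE EXTERNAL TABLE " + table + " USING " + provider + " " + opts;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("path", "/home/sys1010/Desktop/category_sub.csv");
        options.put("header", "true");
        options.put("inferSchema", "true");
        options.put("nullValue", "NULL");
        options.put("maxCharsPerColumn", "4096");

        String ddl = build("CATEGORY_SUBSCRIBER", "CSV", options);
        System.out.println(ddl);
        // With an open JDBC connection named snappy, as in the question:
        //   try (Statement st = snappy.createStatement()) { st.execute(ddl); }
    }
}
```

Passing "com.databricks.spark.csv" as the provider instead of "CSV" produces the same statement as the one in answer 0.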