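After creating a table with 5000 rows of random integers, each less than 1000, I am running an APPROX_SUM query against it.
It always fails with the exception
"Exactly one argument is expected."
But I am passing only one column, and it is of integer type, as shown below. I was running shark-withinfo. Can someone give me a hint on how to solve this problem?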
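For anyone trying to reproduce this, input data like that described above can be generated with something along these lines. This is only a sketch: the helper names, the file name rand5000.txt and the optional seed are assumptions for illustration, not taken from the original setup.

```python
import random


def generate_numbers(n=5000, upper=1000, seed=None):
    """Return n random integers in the range [0, upper)."""
    rng = random.Random(seed)  # a seed makes the data reproducible
    return [rng.randrange(upper) for _ in range(n)]


def write_numbers(path, numbers):
    """Write one integer per line, matching a single-column table."""
    with open(path, "w") as f:
        for value in numbers:
            f.write(f"{value}\n")


if __name__ == "__main__":
    # Hypothetical output file name; adjust to your environment.
    write_numbers("rand5000.txt", generate_numbers())
```

The resulting file could then be loaded into a single-column int table (for example with Hive's LOAD DATA LOCAL INPATH statement), assuming the table was created with one int column named numbers as in the DESCRIBE output below.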
shark> DESCRIBE rand5000;
18/10/11 11:57:09 INFO shark.SharkCliDriver: Execution Mode: shark
18/10/11 11:57:10 INFO ql.Driver: <PERFLOG method=Driver.run>
18/10/11 11:57:10 INFO ql.Driver: <PERFLOG method=compile>
18/10/11 11:57:10 INFO parse.ParseDriver: Parsing command: DESCRIBE rand5000
18/10/11 11:57:10 INFO parse.ParseDriver: Parse Completed
18/10/11 11:57:10 INFO parse.DDLSemanticAnalyzer: analyzeDescribeTable done
18/10/11 11:57:10 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
18/10/11 11:57:10 INFO ql.Driver: </PERFLOG method=compile start=1539277030001 end=1539277030529 duration=528>
18/10/11 11:57:10 INFO ql.Driver: <PERFLOG method=Driver.execute>
18/10/11 11:57:10 INFO ql.Driver: Starting command: DESCRIBE rand5000
18/10/11 11:57:10 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
18/10/11 11:57:10 INFO metastore.ObjectStore: ObjectStore, initialize called
7.700: [GC (Metadata GC Threshold) 355352K->30896K(5024768K), 0.0217271 secs]
7.722: [Full GC (Metadata GC Threshold) 30896K->20373K(5024768K), 0.0616700 secs]
8.237: [GC (System.gc()) 114782K->22795K(5024768K), 0.0052734 secs]
8.243: [Full GC (System.gc()) 22795K->11577K(5024768K), 0.1667079 secs]
8.411: [GC (System.gc()) 48292K->11712K(5024768K), 0.0013864 secs]
8.412: [Full GC (System.gc()) 11712K->10076K(5024768K), 0.0616413 secs]
18/10/11 11:57:13 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
18/10/11 11:57:13 INFO metastore.ObjectStore: Initialized ObjectStore
18/10/11 11:57:14 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=rand5000
18/10/11 11:57:14 INFO hive.log: DDL: struct rand5000 { i32 numbers}
18/10/11 11:57:14 INFO exec.DDLTask: DDLTask: got data for rand5000
18/10/11 11:57:14 INFO exec.DDLTask: DDLTask: written data for rand5000
18/10/11 11:57:14 INFO ql.Driver: </PERFLOG method=Driver.execute start=1539277030530 end=1539277034972 duration=4442>
OK
18/10/11 11:57:14 INFO ql.Driver: OK
18/10/11 11:57:14 INFO ql.Driver: <PERFLOG method=releaseLocks>
18/10/11 11:57:14 INFO ql.Driver: </PERFLOG method=releaseLocks start=1539277034972 end=1539277034972 duration=0>
18/10/11 11:57:14 INFO ql.Driver: </PERFLOG method=Driver.run start=1539277030001 end=1539277034973 duration=4972>
18/10/11 11:57:15 INFO mapred.FileInputFormat: Total input paths to process : 1
numbers int
Time taken: 5.078 seconds
18/10/11 11:57:15 INFO CliDriver: Time taken: 5.078 seconds
18/10/11 11:57:15 INFO ql.Driver: <PERFLOG method=releaseLocks>
18/10/11 11:57:15 INFO ql.Driver: </PERFLOG method=releaseLocks start=1539277035078 end=1539277035078 duration=0>
shark> SELECT APPROX_SUM(numbers) FROM rand5000;
18/10/11 11:57:28 INFO shark.SharkCliDriver: Execution Mode: shark
18/10/11 11:57:28 INFO ql.Driver: <PERFLOG method=Driver.run>
18/10/11 11:57:28 INFO ql.Driver: <PERFLOG method=compile>
18/10/11 11:57:28 INFO parse.ParseDriver: Parsing command: SELECT APPROX_SUM(numbers) FROM rand5000
18/10/11 11:57:28 INFO parse.ParseDriver: Parse Completed
18/10/11 11:57:28 INFO parse.SharkSemanticAnalyzer: Get metadata for source tables
18/10/11 11:57:28 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=rand5000
18/10/11 11:57:28 INFO hive.log: DDL: struct rand5000 { i32 numbers}
18/10/11 11:57:28 INFO parse.SharkSemanticAnalyzer: Get metadata for subqueries
18/10/11 11:57:28 INFO parse.SharkSemanticAnalyzer: Get metadata for destination tables
18/10/11 11:57:28 INFO hive.log: DDL: struct rand5000 { i32 numbers}
FAILED: Error in semantic analysis: Exactly one argument is expected.
18/10/11 11:57:28 ERROR shark.SharkDriver: FAILED: Error in semantic analysis: Exactly one argument is expected.
org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: Exactly one argument is expected.
at org.apache.hadoop.hive.ql.udf.approx.ApproxUDAFSum.getEvaluator(ApproxUDAFSum.java:60)
at org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver.getEvaluator(AbstractGenericUDAFResolver.java:47)
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getGenericUDAFEvaluator(FunctionRegistry.java:785)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getGenericUDAFEvaluator(SemanticAnalyzer.java:2464)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2904)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:3704)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:6183)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:6820)
at shark.parse.SharkSemanticAnalyzer.analyzeInternal(SharkSemanticAnalyzer.scala:160)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:244)
at shark.SharkDriver.compile(SharkDriver.scala:194)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:895)
at shark.SharkCliDriver.processCmd(SharkCliDriver.scala:294)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:203)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
18/10/11 11:57:28 INFO ql.Driver: </PERFLOG method=compile start=1539277048550 end=1539277048725 duration=175>
18/10/11 11:57:28 INFO ql.Driver: <PERFLOG method=releaseLocks>
18/10/11 11:57:28 INFO ql.Driver: </PERFLOG method=releaseLocks start=1539277048725 end=1539277048725 duration=0>