我正在使用Spark SQL解析JSON并且它运行得很好,它找到了架构并且我正在使用它进行查询。
现在我需要"扁平" JSON和我在论坛中读到最好的方法是使用Hive(Lateral View)进行爆炸,所以我尝试用它来做同样的事情。但我甚至无法创建上下文... Spark给了我一个错误,我无法找到解决方法。
正如我所说,此时我只想创建de context:
println ("Create Spark Context:")
val sc = new SparkContext( "local", "Simple", "$SPARK_HOME")
println ("Create Hive context:")
val hiveContext = new HiveContext(sc)
它给了我这个错误:
Create Spark Context:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/12/26 15:13:44 INFO Remoting: Starting remoting
15/12/26 15:13:45 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.80.136:40624]
Create Hive context:
15/12/26 15:13:50 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/12/26 15:13:50 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/12/26 15:13:56 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:13:56 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:13:58 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:13:58 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:13:59 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
15/12/26 15:14:01 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/12/26 15:14:01 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:226)
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:392)
at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:174)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:177)
at pebd.emb.Bicing$.main(Bicing.scala:73)
at pebd.emb.Bicing.main(Bicing.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: java.lang.OutOfMemoryError: PermGen space
Process finished with exit code 1
我知道这是一个非常简单的问题,但我真的不知道这个错误的原因。 提前感谢大家。
答案 0 :(得分:6)
以下是例外的相关部分:
Caused by: java.lang.OutOfMemoryError: PermGen space
您需要增加为JVM提供的PermGen内存量。默认情况下(SPARK-1879),Spark自己的启动脚本将此值增加到128 MB,所以我认为你必须在IntelliJ运行配置中做类似的事情。尝试将-XX:MaxPermSize=128m
添加到“VM选项”列表中。