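Today I tried to run my first Spark Streaming program written in Java. For now, all I want to do is fetch a stream of tweets through the Spark Twitter API and filter it.
After exporting my program to a jar file (simplestream.jar), I ran this command: spark-submit simplestream.jar
Result: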
Exception in thread "main" java.lang.UnsupportedClassVersionError: simplestream/Main : Unsupported major.minor version 52.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:173)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:639)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
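After some research, I found that switching to JRE 7 is supposed to fix the problem, but I can't do that because my code uses lambda expressions, which are only supported in Java 8 and above: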
    package simplestream;

    import java.util.Arrays;

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Duration;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.twitter.TwitterUtils;

    import twitter4j.auth.Authorization;
    import twitter4j.auth.AuthorizationFactory;
    import twitter4j.conf.Configuration;
    import twitter4j.conf.ConfigurationContext;

    public class Main {
        public static void main(String[] args) {
            // Twitter4J
            // IMPORTANT: adjust your API keys in the twitter4J.properties file
            Configuration twitterConf = ConfigurationContext.getInstance();
            Authorization twitterAuth = AuthorizationFactory.getInstance(twitterConf);

            // Spark
            SparkConf sparkConf = new SparkConf()
                    .setAppName("Tweets Android")
                    .setMaster("local[2]");
            JavaStreamingContext sc = new JavaStreamingContext(sparkConf, new Duration(5000));

            // basic stats on tweets
            String[] filters = { "#Android" };
            TwitterUtils.createStream(sc, twitterAuth, filters)
                    .flatMap(s -> Arrays.asList(s.getHashtagEntities()))
                    .map(h -> h.getText().toLowerCase())
                    .filter(h -> !h.equals("android"))
                    .countByValue()
                    .print();

            sc.start();
            sc.awaitTermination();
        }
    }
I'm using Eclipse on the Cloudera VM, with compiler compliance level 1.7 and JRE 1.8.
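
For context, the 52.0 in the error above is the class-file major version for Java 8 (51 is Java 7). The message therefore means that Main was compiled for Java 8 but loaded by an older JVM. Below is a minimal sketch for checking both sides of that mismatch. ClassVersion and majorVersion are hypothetical names used only for illustration and are not part of the project above. The sketch assumes the class is compiled and run from the same place, so that it can read its own .class file as a resource.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the simplestream project.
public class ClassVersion {

    /** Reads the major version from the header of a compiled class file. */
    static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();        // every class file starts with 0xCAFEBABE
        if (magic != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        data.readUnsignedShort();          // minor version, ignored here
        return data.readUnsignedShort();   // major version: 51 = Java 7, 52 = Java 8
    }

    public static void main(String[] args) throws IOException {
        // Version this class was compiled for (assumes the .class file is readable as a resource).
        try (InputStream in = ClassVersion.class.getResourceAsStream("ClassVersion.class")) {
            if (in != null) {
                System.out.println("compiled for major version " + majorVersion(in));
            }
        }
        // Highest class-file version and Java release of the JVM actually running this code.
        System.out.println("java.class.version = " + System.getProperty("java.class.version"));
        System.out.println("java.version       = " + System.getProperty("java.version"));
        System.out.println("java.home          = " + System.getProperty("java.home"));
    }
}
```

If the major version printed for the class is higher than the java.class.version of the JVM that launches it, the same UnsupportedClassVersionError would be expected. Running a class like this through the same launcher as the failing job is one way to see which Java installation that launcher actually picks up.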