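I am using Spark Streaming to pull tweets from Twitter and process them. The application works when I run it from Eclipse with the master set to local[*], but when I submit it to Spark on the server I get the following error:
No route to host. Relevant discussions can be found on the Internet at: http://www.google.co.jp/search?q=944a924a or http://www.google.co.jp/search?q=24fd66eb
My code: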
SparkConf sparkConf = new SparkConf().setAppName("AppStreaming");
// use local[*] to run Spark locally
if(local){
sparkConf.setMaster("local[*]");
}
// set proxy
System.setProperty("http.proxyHost", PROXY_HOST);
System.setProperty("http.proxyPort", PROXY_PORT);
System.setProperty("https.proxyHost", PROXY_HOST);
System.setProperty("https.proxyPort", PROXY_PORT);
JavaStreamingContext streamingContext = new JavaStreamingContext(sparkConf, Seconds.apply(5)); // pull new tweets every 5 seconds
// connect to Twitter
ConfigurationBuilder cb=new ConfigurationBuilder();
cb.setDebugEnabled(true)
.setOAuthConsumerKey(CONSUMER_KEY)
.setOAuthConsumerSecret(CONSUMER_SECRET)
.setOAuthAccessToken(ACCESS_TOKEN)
.setOAuthAccessTokenSecret(ACCESS_TOKEN_SECRET)
.setHttpConnectionTimeout(100000)
.setUseSSL(true);
// proxy
cb.setHttpProxyHost(PROXY_HOST)
.setHttpProxyPort(PROXY_PORT);
TwitterFactory factory = new TwitterFactory(cb.build());
Twitter twitter = factory.getInstance();
Authorization authorization = twitter.getAuthorization();
final String[] filters = new String[]{"credit card"};
JavaReceiverInputDStream<Status> stream = TwitterUtils.createStream(streamingContext, authorization, filters );
JavaDStream<Status> results = stream.filter(new Function<Status, Boolean>(){
private static final long serialVersionUID = 1L;
@Override
public Boolean call(Status tweet) {
return myFunction(); // call() must return a Boolean; myFunction() is my own predicate
}
});
results.print();
// start streaming...
streamingContext.start();
streamingContext.awaitTermination();
Console output:
Time: 1441269605000 ms
-------------------------------------------
15/09/03 10:40:05 INFO JobScheduler: Finished job streaming job 1441269605000 ms.0 from job set of time 1441269605000 ms
15/09/03 10:40:05 INFO JobScheduler: Total delay: 0,009 s for time 1441269605000 ms (execution: 0,001 s)
15/09/03 10:40:05 INFO JobScheduler: Added jobs for time 1441269605000 ms
15/09/03 10:40:05 INFO MapPartitionsRDD: Removing RDD 173 from persistence list
15/09/03 10:40:05 INFO BlockManager: Removing RDD 173
15/09/03 10:40:05 INFO BlockRDD: Removing RDD 172 from persistence list
15/09/03 10:40:05 INFO BlockManager: Removing RDD 172
15/09/03 10:40:05 INFO TwitterInputDStream: Removing blocks of RDD BlockRDD[172] at createStream at MoriartyStreaming.java:216 of time 1441269605000 ms
15/09/03 10:40:05 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer(1441269595000 ms)
15/09/03 10:40:05 INFO InputInfoTracker: remove old batch metadata: 1441269595000 ms
15/09/03 10:40:05 INFO ReceiverTracker: Registered receiver for stream 0 from 7.121.100.47:48986
15/09/03 10:40:05 ERROR ReceiverTracker: Deregistered receiver for stream 0: Restarting receiver with delay 2000ms: Error receiving tweets - No existe ninguna ruta hasta el «host»
Relevant discussions can be found on the Internet at:
http://www.google.co.jp/search?q=944a924a or
http://www.google.co.jp/search?q=24fd66eb
TwitterException{exceptionCode=[944a924a-24fd66eb 944a924a-24fd66c1], statusCode=-1, message=null, code=-1, retryAfter=-1, rateLimitStatus=null, version=3.0.3}
at twitter4j.internal.http.HttpClientImpl.request(HttpClientImpl.java:192)
at twitter4j.internal.http.HttpClientWrapper.request(HttpClientWrapper.java:61)
at twitter4j.internal.http.HttpClientWrapper.post(HttpClientWrapper.java:98)
at twitter4j.TwitterStreamImpl.getFilterStream(TwitterStreamImpl.java:304)
at twitter4j.TwitterStreamImpl$7.getStream(TwitterStreamImpl.java:292)
at twitter4j.TwitterStreamImpl$TwitterStreamConsumer.run(TwitterStreamImpl.java:462)
Caused by: java.net.NoRouteToHostException: No existe ninguna ruta hasta el «host»
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:625)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:933)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1092)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250)
at twitter4j.internal.http.HttpClientImpl.request(HttpClientImpl.java:150)
... 5 more
pom.xml:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-twitter_2.10</artifactId>
<version>1.1.0</version>
</dependency>
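My application runs behind a proxy both locally and on the server.
Any suggestions as to why I am getting this error?
Answer 0 (score: 0)
I struggled with the same error until I added the following to my .bashrc file: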
export _JAVA_OPTIONS='-Dhttp.proxyHost=www.xxxx.net -Dhttp.proxyPort=8080 -Dtwitter4j.http.proxyHost=www.xxxx.net -Dtwitter4j.http.proxyPort=8080 -Dtwitter4j.https.proxyHost=www.xxxx.net -Dtwitter4j.https.proxyPort=8080'
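A likely reason this helps: System.setProperty only affects the JVM that calls it, and on a cluster the Twitter receiver runs in an executor JVM rather than in the driver, whereas with local[*] everything shares one JVM. The sketch below builds the same -D flags as the export line above so they can be handed to the executor JVMs from code instead of .bashrc. The class name ProxyOptions and the host proxy.example.net are hypothetical; the property names are copied from the export line, and whether they reach the receiver in your setup is an assumption to verify.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the JVM proxy flags used in the export line above.
final class ProxyOptions {
    private ProxyOptions() {}

    // Returns a space-separated list of -D flags for the given proxy.
    static String build(String host, int port) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("proxy host must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("proxy port out of range: " + port);
        }
        List<String> flags = new ArrayList<>();
        // Property prefixes taken from the answer's _JAVA_OPTIONS line.
        for (String prefix : new String[] {"http", "twitter4j.http", "twitter4j.https"}) {
            flags.add("-D" + prefix + ".proxyHost=" + host);
            flags.add("-D" + prefix + ".proxyPort=" + port);
        }
        return String.join(" ", flags);
    }

    public static void main(String[] args) {
        // proxy.example.net is a placeholder host, not a real proxy.
        System.out.println(build("proxy.example.net", 8080));
    }
}
```

One way to use it, assuming the standard Spark setting spark.executor.extraJavaOptions is honoured by your cluster manager: sparkConf.set("spark.executor.extraJavaOptions", ProxyOptions.build(host, port)) before creating the JavaStreamingContext. Unlike _JAVA_OPTIONS in .bashrc, this keeps the proxy settings with the application instead of the login shell.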