I am trying to read data from ElasticSearch into Spark:
conf = {"es.resource":"sflow_*/sflow","es.nodes":"ES01","es.query":'some query'}
rdd = sc.newAPIHadoopRDD("org.elasticsearch.hadoop.mr.EsInputFormat", "org.apache.hadoop.io.NullWritable", "org.elasticsearch.hadoop.mr.LinkedMapWritable", conf=conf)
rdd.take(2)
After calling rdd.take(2), the process stalls and emits a warning log like the following:
16/03/14 20:52:07 WARN httpclient.SimpleHttpConnectionManager: SimpleHttpConnectionManager being used
incorrectly. Be sure that HttpMethod.releaseConnection() is always called and that only one thread and/or
method is using this connection manager at a time.
However, rdd.first() always returns a result successfully. Does anyone know why?