I am using the Spark 2.1 spark-shell on Linux.
./bin/spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.1.0
The Spark shell starts without any problems.
val ds1 = spark.readStream.option("kafka.bootstrap.servers", "xx.xx.xxx.xxx:9092,xx.xx.xxx.xxx:9092").option("subscribe", "MickyMouse").load()
I get the following exception:
java.lang.IllegalArgumentException: 'path' is not specified
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$9.apply(DataSource.scala:205)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$9.apply(DataSource.scala:205)
at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
at org.apache.spark.sql.catalyst.util.CaseInsensitiveMap.getOrElse(CaseInsensitiveMap.scala:23)
at org.apache.spark.sql.execution.datasources.DataSource.sourceSchema(DataSource.scala:204)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo$lzycompute(DataSource.scala:87)
at org.apache.spark.sql.execution.datasources.DataSource.sourceInfo(DataSource.scala:87)
at org.apache.spark.sql.execution.streaming.StreamingRelation$.apply(StreamingRelation.scala:30)
at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:124)
The Kafka server is up and running.
Any ideas on how I can successfully read from the Kafka source?
Answer 0 (score: 0)
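You forgot to call the format method. When no format is given, the reader falls back to the default data source, which is parquet, and a file-based source needs a path. That is why the exception complains that 'path' is not specified. Changing the code to spark.readStream.format("kafka").option... should fix the problem.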
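For reference, here is a minimal sketch of the corrected read, assuming it is pasted into the same spark-shell session started with the spark-sql-kafka-0-10 package, where spark is already defined. The broker addresses and topic name are the placeholders from the question. The string cast and the console sink are not part of the original question; they are only added here as a quick way to check that records arrive.

```scala
// Run inside spark-shell, where `spark` (a SparkSession) is predefined.
val ds1 = spark.readStream
  .format("kafka") // without this, the default source (parquet) is used and demands a 'path'
  .option("kafka.bootstrap.servers", "xx.xx.xxx.xxx:9092,xx.xx.xxx.xxx:9092")
  .option("subscribe", "MickyMouse")
  .load()

// Kafka keys and values arrive as binary columns; cast them to strings for display.
val messages = ds1.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

// Assumption for illustration only: print each micro-batch to the console.
val query = messages.writeStream
  .format("console")
  .outputMode("append")
  .start()

// Blocks the shell until the query stops; omit this line to keep the prompt usable.
query.awaitTermination()
```

Calling query.stop() from the shell ends the stream if awaitTermination() was left out.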
Answer 1 (score: 0)
This should do the trick:
let myRequest = NSMutableURLRequest(url: URL(string: "https://news.ycombinator.com/item?id=19722704")!)
let dataTask : URLSessionTask = URLSession.shared.dataTask(with: myRequest as URLRequest, completionHandler: { data, response, error in
guard error == nil else {
return
}
guard let data = data else {
return
}
if let htmlString = String(bytes: data, encoding: String.Encoding.utf8), let doc = try? HTML(html: htmlString, encoding: .utf8) {
for postDescription in doc.xpath("//*[@id=\"hnmain\"]//tr[3]/td/table[1]//tr[4]/td[2]") {
print("postDescription: \(String(describing: postDescription.content))")
}
for comment in doc.xpath("//table[@class=\"comment-tree\"]//tr") {
print("Comment: \(String(describing: comment.content))")
}
}
})
dataTask.resume()
I know it is late to respond, but this may help others who run into a similar problem.