A while ago I kept running into problems reading zip files in Apache Spark, and I shared my answer on StackOverflow.

As @Programmer rightly pointed out, I wasn't closing the streams I had opened. I tried closing them inside takeWhile (inspiration):
Stream.continually(zis.getNextEntry)
  .takeWhile {
    case null => zis.close(); false     // no more entries: close the zip stream
    case _    => true
  }
  .flatMap { _ =>
    val br = new BufferedReader(new InputStreamReader(zis))
    Stream.continually(br.readLine())
      .takeWhile {
        case null => br.close(); false  // end of this entry: close the reader
        case _    => true
      }
  }
But it doesn't work! When reading a zip file, I now get this error:
Job aborted due to stage failure: Task 0 in stage 5.0 failed 1 times, most recent failure: Lost task 0.0 in stage 5.0 (TID 20, localhost): java.io.IOException: Stream closed
at java.util.zip.ZipInputStream.ensureOpen(ZipInputStream.java:67)
at java.util.zip.ZipInputStream.getNextEntry(ZipInputStream.java:116)
If I simply leave the stream open, it works fine. So it looks like I am closing the stream and then trying to read from it again, but I don't understand why that happens or how to fix it.
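For what it's worth, the error can be reproduced in isolation: BufferedReader.close() closes the Reader it wraps, and InputStreamReader.close() in turn closes the underlying stream, so the inner br.close() also shuts down the shared ZipInputStream. A minimal standalone sketch of that behavior (the archive name sample.zip is hypothetical):

import java.io.{BufferedReader, FileInputStream, InputStreamReader}
import java.util.zip.ZipInputStream

// Hypothetical archive name, just to demonstrate close propagation.
val zis = new ZipInputStream(new FileInputStream("sample.zip"))
zis.getNextEntry                                  // position on the first entry

val br = new BufferedReader(new InputStreamReader(zis))
Stream.continually(br.readLine()).takeWhile(_ != null).foreach(println)
br.close()                                        // also closes the wrapped zis

zis.getNextEntry                                  // java.io.IOException: Stream closed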
Answer 0 (score: 0)
Following @alexandre-dupriez's comment, I closed only the outer Stream, and that helped:
Stream.continually(zis.getNextEntry)
  .takeWhile {
    case null => zis.close(); false   // close only the outer zip stream
    case _    => true
  }
  .flatMap { _ =>
    val br = new BufferedReader(new InputStreamReader(zis))
    // read all lines of the current entry, but leave br (and thus zis)
    // open so the next getNextEntry call still works
    Stream.continually(br.readLine()).takeWhile(_ != null)
  }
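For completeness, here is a sketch of how a snippet like this is typically wired into Spark via binaryFiles; the input path and variable names are assumptions, not from the original post:

import java.io.{BufferedReader, InputStreamReader}
import java.util.zip.ZipInputStream

// Hypothetical wiring: sc is the SparkContext and the path is an assumption.
val lines = sc.binaryFiles("/data/archives/*.zip")
  .flatMap { case (_, portableStream) =>
    val zis = new ZipInputStream(portableStream.open())
    Stream.continually(zis.getNextEntry)
      .takeWhile {
        case null => zis.close(); false   // close only the outer stream
        case _    => true
      }
      .flatMap { _ =>
        val br = new BufferedReader(new InputStreamReader(zis))
        Stream.continually(br.readLine()).takeWhile(_ != null)
      }
  }
lines.take(10).foreach(println)           // sanity check on a few lines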
So now I need some time to confirm whether it actually works correctly.