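I want to implement a custom Source[ByteString] in Akka Streams. The source should read data from the supplied file, restricted to the supplied byte range, and propagate it downstream.
At first I thought this could be done by implementing an Actor that mixes in ActorPublisher. My implementation is modelled on akka.stream.impl.io.FilePublisher, which reads the whole file at the given path rather than only the data in a given byte range: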
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Path, StandardOpenOption}

import akka.actor.{ActorLogging, DeadLetterSuppression, Props}
import akka.stream.actor.ActorPublisher
import akka.stream.actor.ActorPublisherMessage.{Cancel, Request}
import akka.util.ByteString

import scala.annotation.tailrec
import scala.util.control.NonFatal

class FilePublisher(pathToFile: Path, startByte: Long, endByte: Long) extends ActorPublisher[ByteString]
  with ActorLogging {

  import FilePublisher._

  private val chunksToBuffer = 10
  private var bytesLeftToRead = endByte - startByte + 1
  private var fileChannel: FileChannel = _
  private val buffer = ByteBuffer.allocate(8096)

  private var bufferedChunks: Vector[ByteString] = _

  override def preStart(): Unit = {
    try {
      log.info("Starting")
      fileChannel = FileChannel.open(pathToFile, StandardOpenOption.READ)
      bufferedChunks = readAhead(Vector.empty, Some(startByte))
      log.info("Chunks {}", bufferedChunks)
    } catch {
      case NonFatal(ex) => onErrorThenStop(ex)
    }
  }

  override def postStop(): Unit = {
    log.info("Stopping")
    if (fileChannel ne null)
      try fileChannel.close() catch {
        case NonFatal(ex) => log.error(ex, "Error during file channel close")
      }
  }

  override def receive: Receive = {
    case Request =>
      readAndSignalNext()
      log.info("Got request")
    case Continue =>
      log.info("Continuing reading")
      readAndSignalNext()
    case Cancel =>
      log.info("Cancel message got")
      context.stop(self)
  }

  private def readAndSignalNext() = {
    log.info("Reading and signaling")
    if (isActive) {
      bufferedChunks = readAhead(signalOnNext(bufferedChunks), None)
      if (isActive && totalDemand > 0) self ! Continue
    }
  }

  @tailrec
  private def signalOnNext(chunks: Vector[ByteString]): Vector[ByteString] = {
    if (chunks.nonEmpty && totalDemand > 0) {
      log.info("Signaling")
      onNext(chunks.head)
      signalOnNext(chunks.tail)
    } else {
      if (chunks.isEmpty && bytesLeftToRead > 0) {
        onCompleteThenStop()
      }
      chunks
    }
  }

  @tailrec
  private def readAhead(currentlyBufferedChunks: Vector[ByteString], startPosition: Option[Long]): Vector[ByteString] = {
    if (currentlyBufferedChunks.size < chunksToBuffer) {
      val bytesRead = readDataFromChannel(startPosition)
      log.info("Bytes read {}", bytesRead)
      bytesRead match {
        case Int.MinValue => Vector.empty
        case -1 =>
          log.info("EOF reached")
          currentlyBufferedChunks // EOF reached
        case _ =>
          buffer.flip()
          val chunk = ByteString(buffer)
          buffer.clear()

          bytesLeftToRead -= bytesRead
          val trimmedChunk = if (bytesLeftToRead >= 0) chunk else chunk.dropRight(bytesLeftToRead.toInt)
          readAhead(currentlyBufferedChunks :+ trimmedChunk, None)
      }
    } else {
      currentlyBufferedChunks
    }
  }

  private def readDataFromChannel(startPosition: Option[Long]): Int = {
    try {
      startPosition match {
        case Some(position) => fileChannel.read(buffer, position)
        case None => fileChannel.read(buffer)
      }
    } catch {
      case NonFatal(ex) =>
        log.error(ex, "Got error reading data from file channel")
        Int.MinValue
    }
  }
}

object FilePublisher {
  private case object Continue extends DeadLetterSuppression

  def props(path: Path, startByte: Long, endByte: Long): Props = Props(classOf[FilePublisher], path, startByte, endByte)
}
But it turns out that when I materialize a Source backed by my FilePublisher like this:
val fileSource = Source.actorPublisher(FilePublisher.props(pathToFile, 0, fileLength))
val future = fileSource.runWith(Sink.seq)
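nothing happens, and the source never propagates any data downstream.
Is there another, correct way to build a Source on top of my FilePublisher, or should I avoid this API altogether and just implement a custom processing stage, as described here?
The problem with the custom stage approach is that a naive implementation would perform the IO right inside the stage. I suppose I could move the IO out of the stage into a custom thread pool or an actor, but that would require some form of synchronization between the stage and the actor. Thanks.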
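For reference, the byte-range read itself can be sketched independently of Akka. The snippet below is only an illustration under stated assumptions: `readRange` and its `chunkSize` parameter are hypothetical names that do not appear in the question, and the range is treated as inclusive on both ends, as in `endByte - startByte + 1` above. It caps the buffer limit at the number of bytes still wanted, so the last chunk never needs trimming afterwards.

```scala
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Files, Path, StandardOpenOption}

// Hypothetical helper (not from the question): reads bytes startByte..endByte
// (both inclusive) and returns them as chunks of at most chunkSize bytes.
def readRange(path: Path, startByte: Long, endByte: Long, chunkSize: Int = 8192): Vector[Array[Byte]] = {
  val channel = FileChannel.open(path, StandardOpenOption.READ)
  try {
    val buffer = ByteBuffer.allocate(chunkSize)
    val chunks = Vector.newBuilder[Array[Byte]]
    var position = startByte
    var remaining = endByte - startByte + 1
    var eof = false
    while (remaining > 0 && !eof) {
      buffer.clear()
      // Never ask for more bytes than the range still needs.
      buffer.limit(math.min(chunkSize.toLong, remaining).toInt)
      // Positional read: does not depend on or modify the channel's own position.
      val read = channel.read(buffer, position)
      if (read <= 0) {
        eof = true // -1 means the position is at or past the end of the file
      } else {
        buffer.flip()
        val chunk = new Array[Byte](read)
        buffer.get(chunk)
        chunks += chunk
        position += read
        remaining -= read
      }
    }
    chunks.result()
  } finally channel.close()
}
```

This is blocking IO, so the concern about where it runs applies to this sketch just as it does to the publisher.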
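Answer 0 (score: 0)
I noticed that you are not currently using a separate dispatcher for the IO operations. Here's the documentation section explaining why not doing so can cause nasty blocking in your application.
Akka Streams uses a dedicated thread-pool-based dispatcher when it wraps FilePublisher in a FileSource. You can look at their code for inspiration here.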
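As a rough sketch of what that could look like for an actor-based publisher, the Props can be pinned to a dedicated dispatcher before they are handed to the source. The helper name `onDispatcher` is hypothetical, and the dispatcher id below is an assumption: it is the id I believe akka-stream's reference.conf defines in the versions that still ship ActorPublisher, so check it against your own version or substitute a dispatcher you configure yourself.

```scala
import akka.actor.Props

// Assumed dispatcher id; verify it against the reference.conf of your Akka version.
val blockingIoDispatcherId = "akka.stream.default-blocking-io-dispatcher"

// Hypothetical helper: returns a copy of the given Props whose actor
// will run on the named dispatcher instead of the default one.
def onDispatcher(props: Props, dispatcherId: String = blockingIoDispatcherId): Props =
  props.withDispatcher(dispatcherId)
```

With the code from the question, this would be used as `Source.actorPublisher[ByteString](onDispatcher(FilePublisher.props(pathToFile, 0, fileLength)))`.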
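Answer 1 (score: 0)
The problem was caused by a pattern-matching mistake in the receive method: the line case Request => should be case Request(_) =>, because Request is actually a case class with a single parameter (final case class Request(n: Long)), not a case object as I had assumed.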
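The pitfall can be reproduced without Akka. In the sketch below, `Request` is a local stand-in declared with the same shape as the real message, and `buggy` and `fixed` are hypothetical names. The pattern `case Request =>` compiles because it refers to the companion object as a stable identifier, and an instance such as `Request(5)` is never equal to that object.

```scala
// Local stand-in for akka.stream.actor.ActorPublisherMessage.Request,
// declared here only so the snippet runs without Akka on the classpath.
final case class Request(n: Long)

// Buggy: `Request` is a stable identifier pattern, so it only matches
// the companion object itself, never a Request(n) instance.
def buggy(msg: Any): String = msg match {
  case Request => "handled"
  case _       => "unhandled"
}

// Fixed: the constructor pattern matches any Request instance.
def fixed(msg: Any): String = msg match {
  case Request(_) => "handled"
  case _          => "unhandled"
}
```

Here `buggy(Request(5))` falls through to the default case, while `fixed(Request(5))` is handled. In the actor from the question the demand signal is dropped the same way, so readAndSignalNext() is never called and nothing reaches the sink.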