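I plan to use Java to process Markdown text files that specify additional metadata, such as title, author, and creation date, in a YAML block at the beginning of the document. Here is an example: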
---
title: An example document
author: Paul
created: 2013-05-19
---
The _body_ of this document is
written in **Markdown**.
To parse the YAML data I can use snakeyaml. As far as I know, the methods yaml.load() and yaml.loadAll() can load YAML documents from a String, a java.io.InputStream, or a java.io.Reader (see the SnakeYAML documentation).
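I don't want to use the version that reads from a String, because that would cause performance problems with large files. But passing the file as an InputStream fails, because the stream as a whole is not a valid YAML document. Only the first part of the stream is a valid document.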
So my question is: how can I use a java.io.FilterInputStream / java.io.FilterReader, or some other approach, to produce a stream that stops after the second --- so that the whole stream is valid YAML?
Answer 0 (score: 1)
Add "..." (three dots), YAML's document end marker, at the point where you want the YAML parser to stop.
Answer 1 (score: 0)
Here is my solution (Scala code):
import java.io.InputStream
import java.io.InputStreamReader
import java.nio.charset.Charset
import scala.collection.mutable.Queue

/**
 * Reader for metadata that is contained in the given `InputStream`.
 * It reports end of stream as soon as the second `---` divider is reached.
 *
 * @constructor Create a new metadata reader with a given `Charset`.
 * @param in underlying input stream
 * @param charset encoding of the stream
 */
class MetadataReader(in: InputStream, charset: Charset)
    extends InputStreamReader(in, charset) {

  private val lookahead = Queue.empty[Int] // buffer for looking ahead
  private var afterNewline = true // indicates that the last char was a newline
  private var divider = 0 // number of dividers ("---") seen so far

  /**
   * Create new MetadataReader with the system's default `Charset`.
   *
   * @param in underlying input stream
   */
  def this(in: InputStream) = this(in, Charset.defaultCharset())

  /**
   * Read the next character.
   *
   * @return next character, or -1 at the second divider or at end of stream
   */
  override def read(): Int =
    if (divider == 2) {
      -1
    } else if (lookahead.nonEmpty) {
      deliver(lookahead.dequeue())
    } else if (!afterNewline) {
      deliver(super.read())
    } else {
      // at the start of a line: buffer up to three dashes to detect a divider
      var c = super.read()
      lookahead.enqueue(c)
      while (c == '-' && lookahead.length < 3) {
        c = super.read()
        lookahead.enqueue(c)
      }
      if (c == '-') divider += 1 // the line starts with three dashes
      if (divider == 2) -1 else deliver(lookahead.dequeue())
    }

  // hand out a character and remember whether it ended a line
  private def deliver(c: Int): Int = {
    afterNewline = c == '\n'
    c
  }

  /**
   * Read characters into a buffer character array.
   *
   * @param buf buffer array
   * @param off offset to start in the array
   * @param len number of characters to read
   * @return number of characters actually read, or -1 at end of stream
   */
  override def read(buf: Array[Char], off: Int, len: Int): Int = {
    var n = 0
    var c = if (len > 0) read() else -1
    while (c != -1) {
      buf(off + n) = c.toChar // write relative to the offset
      n += 1
      c = if (n < len) read() else -1
    }
    if (n == 0 && len > 0) -1 else n
  }
}
You can use it like this:
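import java.io.{File, FileInputStream}
import java.nio.charset.Charset
import org.yaml.snakeyaml.Yaml

val yaml = new Yaml
val mr = new MetadataReader(new FileInputStream(
  new File("src/test/resources/yaml-test.txt")), Charset.forName("UTF-8"))
// the explicit type keeps Scala from inferring Nothing if load() is generic
try { val data: AnyRef = yaml.load(mr); println(data) }
finally mr.close() // close the reader that was actually opened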
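Since the question asks about Java and java.io.FilterReader, the same idea can be sketched in plain Java. This is only a minimal sketch under a few assumptions: the names FrontMatterReader and frontMatter are made up for illustration, a divider is any line that starts with three dashes, and lines end in '\n'. The opening --- is passed through, and the reader reports end of stream at the second one.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical reader that reports end of stream at the second "---" divider line. */
final class FrontMatterReader extends FilterReader {
    private final int[] pending = new int[3]; // look-ahead buffer for up to three chars
    private int pendingStart = 0;
    private int pendingEnd = 0;
    private boolean atLineStart = true;       // true if the last delivered char was '\n'
    private int dividers = 0;                 // number of divider lines seen so far

    FrontMatterReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (dividers == 2) {
            return -1;
        }
        if (pendingStart == pendingEnd && atLineStart) {
            fillLookahead();
            if (dividers == 2) {
                return -1; // the closing divider itself is not delivered
            }
        }
        int c = (pendingStart < pendingEnd) ? pending[pendingStart++] : in.read();
        atLineStart = (c == '\n');
        return c;
    }

    /** Buffers the first chars of a line and counts it as a divider if it starts with "---". */
    private void fillLookahead() throws IOException {
        pendingStart = 0;
        pendingEnd = 0;
        int c;
        do {
            c = in.read();
            pending[pendingEnd++] = c;
        } while (c == '-' && pendingEnd < 3);
        if (c == '-') {
            dividers++; // the loop only ends on a dash after three dashes in a row
        }
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            buf[off + n++] = (char) c;
        }
        return (n == 0 && len > 0) ? -1 : n; // Reader contract: -1 at end of stream
    }

    /** Demo helper (hypothetical): returns everything before the second divider. */
    static String frontMatter(String document) {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[8];
        try (Reader r = new FrontMatterReader(new StringReader(document))) {
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String doc = "---\ntitle: An example document\nauthor: Paul\n---\nThe _body_ of this document\n";
        System.out.print(frontMatter(doc));
    }
}
```

For a real file you would probably wrap a BufferedReader around an InputStreamReader and hand the resulting FrontMatterReader to yaml.load(), since the sketch reads the underlying reader one character at a time. Methods such as skip() and mark() are not overridden here and would bypass the divider detection.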
Feedback is welcome.