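I need to create an XML document from a piece of plain text plus the start and end offsets of each XML element that should be inserted. Here are some test cases I would like to pass: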
val text = "The dog chased the cat."
val spans = Seq(
(0, 23, <xml/>),
(4, 22, <phrase/>),
(4, 7, <token/>))
val expected = <xml>The <phrase><token>dog</token> chased the cat</phrase>.</xml>
assert(expected === spansToXML(text, spans))
val text = "aabbccdd"
val spans = Seq(
(0, 8, <xml x="1"/>),
(0, 4, <ab y="foo"/>),
(4, 8, <cd z="42>3"/>))
val expected = <xml x="1"><ab y="foo">aabb</ab><cd z="42>3">ccdd</cd></xml>
assert(expected === spansToXML(text, spans))
val spans = Seq(
(0, 1, <a/>),
(0, 0, <b/>),
(0, 0, <c/>),
(1, 1, <d/>),
(1, 1, <e/>))
assert(<a><b/><c/> <d/><e/></a> === spansToXML(" ", spans))
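My partial solution (see my answer below) works through string concatenation and XML.loadString. That seems hacky, and I am also not 100% sure it works correctly in all corner cases...
Is there a better solution? (For what it's worth, I would be happy to switch to anti-xml if that makes this task easier.)
Updated August 10, 2011 to add more test cases and provide a clearer specification.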
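Answer 0 (score: 3)
Given the bounty you offered, I studied your problem for a while and came up with the following solution, which succeeds on all of your test cases. I would really like my answer to be accepted, so please tell me if there is any problem with my solution.
Some comments: I left the commented-out print statements in the code in case you want to see what happens during execution. Beyond your specification, I also preserve the existing children of the elements you pass in (if any); a comment marks where this happens.
Rather than building the XML nodes by hand, I modify the XML nodes that are passed in. To avoid splitting opening and closing tags I had to change the algorithm, but the idea of sorting the spans by begin and -end comes from your solution.
The code is somewhat advanced Scala, especially where I build the different Orderings I need. I did simplify it a bit from the first version I came up with.
I avoided creating a tree to represent the intervals by using a SortedMap and filtering the intervals after extraction. This choice is somewhat suboptimal; I have heard there are better data structures for representing nested intervals, such as interval trees (studied in computational geometry), but they are rather complex to implement and I don't think they are needed here.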
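To make the sort order concrete, here is a minimal sketch of the interval ordering on its own, using only the standard library. The names IntervalOrderDemo and innermostFirst are made up for illustration; the ordering has the same shape as intervalOrder in the code below.

```scala
object IntervalOrderDemo {
  private val intOrdering = implicitly[Ordering[Int]]

  // Decreasing on start, increasing on end: a span sorts before
  // any span that strictly contains it.
  val intervalOrder: Ordering[(Int, Int)] =
    Ordering.Tuple2(intOrdering.reverse, intOrdering)

  // Hypothetical helper: sort (start, end) pairs so that inner spans come first.
  def innermostFirst(intervals: Seq[(Int, Int)]): Seq[(Int, Int)] =
    intervals.sorted(intervalOrder)

  def main(args: Array[String]): Unit =
    println(innermostFirst(Seq((0, 23), (4, 22), (4, 7))))
}
```

For the spans of the first test case this gives (4,7), (4,22), (0,23), so each element is built only after everything nested inside it.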
/**
* User: pgiarrusso
* Date: 12/8/2011
*/
import collection.mutable.ArrayBuffer
import collection.SortedMap
import scala.xml._
object SpansToXmlTest {
def spansToXML(text: String, spans: Seq[(Int, Int, Elem)]) = {
val intOrdering = implicitly[Ordering[Int]] // Retrieves the standard ordering on Ints.
// Sort spans decreasingly on begin and increasingly on end and their label - this processes spans outwards.
// The sorting on labels matches the given examples.
val spanOrder = Ordering.Tuple3(intOrdering.reverse, intOrdering, Ordering.by((_: Elem).label))
//Same sorting, excluding labels.
val intervalOrder = Ordering.Tuple2(intOrdering.reverse, intOrdering)
//Map intervals of the source string to the sequence of nodes which match them - it is a sequence because
//multiple spans for the same interval are allowed.
var intervalMap = SortedMap[(Int, Int), Seq[Node]]()(intervalOrder)
for ((start, end, elem) <- spans.sorted(spanOrder)) {
//Only nested intervals. Interval nesting is a partial order, therefore we cannot use the filter function as an ordering for intervalMap, even if it would be nice.
val nestedIntervalsMap = intervalMap.until((start, end)).filter(_ match {
case ((intStart, intEnd), _) => start <= intStart && intEnd <= end
})
//println("intervalMap: " + intervalMap)
//println("beforeMap: " + nestedIntervalsMap)
//We call sorted to use a standard ordering this time.
val before = nestedIntervalsMap.keys.toSeq.sorted
// text.slice(start, end) must be split into fragments, some of which are represented by text node, some by
// already computed xml nodes.
val intervals = start +: (for {
(intStart, intEnd) <- before
boundary <- Seq(intStart, intEnd)
} yield boundary) :+ end
var xmlChildren = ArrayBuffer[Node]()
var useXmlNode = false
for (interv <- intervals.sliding(2)) {
val intervStart = interv(0)
val intervEnd = interv(1)
xmlChildren.++=(
if (useXmlNode)
intervalMap((intervStart, intervEnd)) //Precomputed nodes
else
Seq(Text(text.slice(intervStart, intervEnd))))
useXmlNode = !useXmlNode //The next interval will be of the opposite kind.
}
//Remove intervals that we just processed
intervalMap = intervalMap -- before
// By using elem.child, you also preserve existing xml children. "elem.child ++" can be also commented out.
var tree = elem.copy(child = elem.child ++ xmlChildren)
intervalMap += (start, end) -> (intervalMap.getOrElse((start, end), Seq.empty) :+ tree)
//println(tree)
}
intervalMap((0, text.length)).head
}
def test(text: String, spans: Seq[(Int, Int, Elem)], expected: Node) {
val res = spansToXML(text, spans)
print("Text: \"%s\", expected:\n%s\nResult:\n%s\n\n" format (text, expected, res))
assert(expected == res)
}
def test1() =
test(
text = "The dog chased the cat.",
spans = Seq(
(0, 23, <xml/>),
(4, 22, <phrase/>),
(4, 7, <token/>)),
expected = <xml>The <phrase><token>dog</token> chased the cat</phrase>.</xml>
)
def test2() =
test(
text = "aabbccdd",
spans = Seq(
(0, 8, <xml x="1"/>),
(0, 4, <ab y="foo"/>),
(4, 8, <cd z="42>3"/>)),
expected = <xml x="1"><ab y="foo">aabb</ab><cd z="42>3">ccdd</cd></xml>
)
def test3() =
test(
text = " ",
spans = Seq(
(0, 1, <a/>),
(0, 0, <b/>),
(0, 0, <c/>),
(1, 1, <d/>),
(1, 1, <e/>)),
expected = <a><b/><c/> <d/><e/></a>
)
def main(args: Array[String]) {
test1()
test2()
test3()
}
}
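Answer 1 (score: 2)
This was fun!
I took an approach similar to Steve's: I split the elements into "start tags" and "end tags", sort them, and then work out where to place them.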
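The "work out where to place them" step is needed because every tag spliced into the string shifts all later insertion points. A minimal sketch of that bookkeeping, independent of the XML types, might look like this (insertTags is a hypothetical helper, and the tags are assumed to arrive already sorted in insertion order):

```scala
object InsertTagsDemo {
  // Splices tags into text. Positions refer to the ORIGINAL text, so a running
  // shift accounts for the characters added by earlier insertions.
  def insertTags(text: String, tags: Seq[(Int, String)]): String = {
    val (result, _) = tags.foldLeft((text, 0)) {
      case ((acc, shift), (pos, tag)) =>
        val (before, after) = acc.splitAt(pos + shift)
        (before + tag + after, shift + tag.length)
    }
    result
  }

  def main(args: Array[String]): Unit =
    println(insertTags("aabb", Seq((0, "<x>"), (2, "<y/>"), (4, "</x>"))))
}
```

The adjustInset function in the code below reaches the same effect differently: it adds each tag's length to the positions of all the tags that come after it.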
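I shamelessly stole Blaisorblade's tests and added a few more that helped me develop the code.
Edited 2011-08-14
I was not happy with how the empty tag is inserted in test-5. However, that placement is a consequence of how test-3 is formulated.
So I changed the tests slightly and provide an alternative solution.
In the alternative solution I start the same way, but with three separate lists: start, empty, and closing tags. Instead of only sorting, there is a third step in which the empty tags are placed into the tag list.
First solution: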
import xml.{XML, Elem, Node}
import annotation.tailrec
object SpanToXml {
def spansToXML(text: String, spans: Seq[(Int, Int, Elem)]): Node = {
// Create a Seq of elements, sorted by where it should be inserted
// differentiate start tags ('s) and empty tags ('e)
val startElms = spans sorted Ordering[Int].on[(Int, _, _)](_._1) map {
case e if e._1 != e._2 => (e._1, e._3, 's)
case e => (e._1, e._3, 'e)
}
//Create a Seq of closing tags ('c), sorted by where they should be inserted
// filter out all empty tags
val endElms = spans.reverse.sorted(Ordering[Int].on[(_, Int, _)](_._2))
.filter(e => e._1 != e._2)
.map(e => (e._2, e._3, 'c))
//Combine the Seq's and sort by insertion point
val elms = startElms ++ endElms sorted Ordering[Int].on[(Int, _, _)](_._1)
//The sorting needs to be refined
// - end tags need to come before start tags if the insertion point is the same
val sorted = elms.sortWith((a, b) => a._1 == b._1 && a._3 == 'c && b._3 == 's )
//Adjust the insertion point to what it should be in the final string
// then insert the tags into the text by folding left
// - there are different rules depending on start, empty or close
val txt = adjustInset(sorted).foldLeft(text)((tx, e) => {
val s = tx.splitAt(e._1)
e match {
case (_, elem, 's) => s._1 + "<" + elem.label + elem.attributes + ">" + s._2
case (_, elem, 'e) => s._1 + "<" + elem.label + elem.attributes + "/>" + s._2
case (_, elem, 'c) => s._1 + "</" + elem.label + ">" + s._2
}
})
//Sanity check
//println(txt)
//Convert to XML
XML.loadString(txt)
}
def adjustInset(elems: Seq[(Int, Elem, Symbol)]): Seq[(Int, Elem, Symbol)] = {
@tailrec
def adjIns(elems: Seq[(Int, Elem, Symbol)], tmp: Seq[(Int, Elem, Symbol)]): Seq[(Int, Elem, Symbol)] =
elems match {
case t :: Nil => tmp :+ t
case t :: ts => {
//calculate offset due to current element
val offset = t match {
case (_, e, 's) => e.label.size + e.attributes.toString.size + 2
case (_, e, 'e) => e.label.size + e.attributes.toString.size + 3
case (_, e, 'c) => e.label.size + 3
}
//add offset to all elm's in tail, and recurse
adjIns(ts.map(e => (e._1 + offset, e._2, e._3)), tmp :+ t)
}
}
adjIns(elems, Nil)
}
def test(text: String, spans: Seq[(Int, Int, Elem)], expected: Node) {
val res = spansToXML(text, spans)
print("Text: \"%s\", expected:\n%s\nResult:\n%s\n\n" format (text, expected, res))
assert(expected == res)
}
def test1() =
test(
text = "The dog chased the cat.",
spans = Seq(
(0, 23, <xml/>),
(4, 22, <phrase/>),
(4, 7, <token/>)),
expected = <xml>The <phrase><token>dog</token> chased the cat</phrase>.</xml>
)
def test2() =
test(
text = "aabbccdd",
spans = Seq(
(0, 8, <xml x="1"/>),
(0, 4, <ab y="foo"/>),
(4, 8, <cd z="42>3"/>)),
expected = <xml x="1"><ab y="foo">aabb</ab><cd z="42>3">ccdd</cd></xml>
)
def test3() =
test(
text = " ",
spans = Seq(
(0, 1, <a/>),
(0, 0, <b/>),
(0, 0, <c/>),
(1, 1, <d/>),
(1, 1, <e/>)),
expected = <a><b/><c/> <d/><e/></a>
)
def test4() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aabb</ab><cd><ok>cc</ok>dd</cd></xml>
)
def test5() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(2, 4, <b/>),
(4, 4, <empty/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aa<b>bb<empty/></b></ab><cd><ok>cc</ok>dd</cd></xml>
)
def test6() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(2, 4, <b/>),
(2, 4, <c/>),
(3, 4, <d/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aa<b><c>b<d>b</d></c></b></ab><cd><ok>cc</ok>dd</cd></xml>
)
def test7() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab a="a" b="b"/>),
(4, 8, <cd c="c" d="d"/>)),
expected = <xml><ab a="a" b="b">aabb</ab><cd c="c" d="d">ccdd</cd></xml>
)
def invalidSpans() = {
val text = "aabbccdd"
val spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(4, 6, <err/>),
(4, 8, <cd/>))
try {
val res = spansToXML(text, spans)
assert(false)
} catch {
case e => {
println("This generate invalid XML:")
println("<xml><ab>aabb</ab><err><cd>cc</err>dd</cd></xml>")
println(e.getMessage)
}
}
}
def main(args: Array[String]) {
test1()
test2()
test3()
test4()
test5()
test6()
test7()
invalidSpans()
}
}
SpanToXml.main(Array())
Alternative solution:
import xml.{XML, Elem, Node}
import annotation.tailrec
object SpanToXmlAlt {
def spansToXML(text: String, spans: Seq[(Int, Int, Elem)]): Node = {
// Create a Seq of start tags, sorted by where it should be inserted
// filter out all empty tags
val startElms = spans.sorted(Ordering[Int].on[(Int, _, _)](_._1))
.filterNot(e => e._1 == e._2)
.map(e => (e._1, e._3, 's))
//Create a Seq of closing tags, sorted by where they should be inserted
// filter out all empty tags
val endElms = spans.reverse.sorted(Ordering[Int].on[(_, Int, _)](_._2))
.filterNot(e => e._1 == e._2)
.map(e => (e._2, e._3, 'c))
//Create a Seq of empty tags, sorted by where they should be inserted
val emptyElms = spans.sorted(Ordering[Int].on[(Int, _, _)](_._1))
.filter(e => e._1 == e._2)
.map(e => (e._1, e._3, 'e))
//Combine the Seq's and sort by insertion point
val elms = startElms ++ endElms sorted Ordering[Int].on[(Int, _, _)](_._1)
//The sorting needs to be refined
// - end tags need to come before start tags if the insertion point is the same
val sorted = elms.sortWith((a, b) => a._1 == b._1 && a._3 == 'c && b._3 == 's )
//Insert empty tags
val allSorted = insertEmpyt(spans, sorted, emptyElms) sorted Ordering[Int].on[(Int, _, _)](_._1)
//Adjust the insertion point to what it should be in the final string
// then insert the tags into the text by folding left
// - there are different rules depending on start, empty or close
val str = adjustInset(allSorted).foldLeft(text)((tx, e) => {
val s = tx.splitAt(e._1)
e match {
case (_, elem, 's) => s._1 + "<" + elem.label + elem.attributes + ">" + s._2
case (_, elem, 'e) => s._1 + "<" + elem.label + elem.attributes + "/>" + s._2
case (_, elem, 'c) => s._1 + "</" + elem.label + ">" + s._2
}
})
//Sanity check
//println(str)
//Convert to XML
XML.loadString(str)
}
def insertEmpyt(spans: Seq[(Int, Int, Elem)],
sorted: Seq[(Int, Elem, Symbol)],
emptys: Seq[(Int, Elem, Symbol)]): Seq[(Int, Elem, Symbol)] = {
//Find all tags that should be before the empty tag
@tailrec
def afterSpan(empty: (Int, Elem, Symbol),
spans: Seq[(Int, Int, Elem)],
after: Seq[(Int, Elem, Symbol)]): Seq[(Int, Elem, Symbol)] = {
var result = after
spans match {
case t :: _ if t._1 == empty._1 && t._2 == empty._1 && t._3 == empty._2 => after //break
case t :: ts if t._1 == t._2 => afterSpan(empty, ts, after :+ (t._1, t._3, 'e))
case t :: ts => {
if (t._1 <= empty._1) result = result :+ (t._1, t._3, 's)
if (t._2 <= empty._1) result = result :+ (t._2, t._3, 'c)
afterSpan(empty, ts, result)
}
}
}
//For each empty tag, insert it in the sorted list
var result = sorted
emptys.foreach(e => {
val afterSpans = afterSpan(e, spans, Seq[(Int, Elem, Symbol)]())
var emptyInserted = false
result = result.foldLeft(Seq[(Int, Elem, Symbol)]())((res, s) => {
if (afterSpans.contains(s) || emptyInserted) {
res :+ s
} else {
emptyInserted = true
res :+ e :+ s
}
})
})
result
}
def adjustInset(elems: Seq[(Int, Elem, Symbol)]): Seq[(Int, Elem, Symbol)] = {
@tailrec
def adjIns(elems: Seq[(Int, Elem, Symbol)], tmp: Seq[(Int, Elem, Symbol)]): Seq[(Int, Elem, Symbol)] =
elems match {
case t :: Nil => tmp :+ t
case t :: ts => {
//calculate offset due to current element
val offset = t match {
case (_, e, 's) => e.label.size + e.attributes.toString.size + 2
case (_, e, 'e) => e.label.size + e.attributes.toString.size + 3
case (_, e, 'c) => e.label.size + 3
}
//add offset to all elm's in tail, and recurse
adjIns(ts.map(e => (e._1 + offset, e._2, e._3)), tmp :+ t)
}
}
adjIns(elems, Nil)
}
def test(text: String, spans: Seq[(Int, Int, Elem)], expected: Node) {
val res = spansToXML(text, spans)
print("Text: \"%s\", expected:\n%s\nResult:\n%s\n\n" format (text, expected, res))
assert(expected == res)
}
def test1() =
test(
text = "The dog chased the cat.",
spans = Seq(
(0, 23, <xml/>),
(4, 22, <phrase/>),
(4, 7, <token/>)),
expected = <xml>The <phrase><token>dog</token> chased the cat</phrase>.</xml>
)
def test2() =
test(
text = "aabbccdd",
spans = Seq(
(0, 8, <xml x="1"/>),
(0, 4, <ab y="foo"/>),
(4, 8, <cd z="42>3"/>)),
expected = <xml x="1"><ab y="foo">aabb</ab><cd z="42>3">ccdd</cd></xml>
)
def test3alt() =
test(
text = " ",
spans = Seq(
(0, 2, <a/>),
(0, 0, <b/>),
(0, 0, <c/>),
(1, 1, <d/>),
(1, 1, <e/>)),
expected = <a><b/><c/> <d/><e/> </a>
)
def test4() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aabb</ab><cd><ok>cc</ok>dd</cd></xml>
)
def test5alt() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(2, 4, <b/>),
(4, 4, <empty/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aa<b>bb</b></ab><empty/><cd><ok>cc</ok>dd</cd></xml>
)
def test5b() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(2, 2, <empty1/>),
(4, 4, <empty2/>),
(2, 4, <b/>),
(2, 2, <empty3/>),
(4, 4, <empty4/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aa<empty1/><b><empty3/>bb<empty2/></b></ab><empty4/><cd><ok>cc</ok>dd</cd></xml>
)
def test6() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(2, 4, <b/>),
(2, 4, <c/>),
(3, 4, <d/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aa<b><c>b<d>b</d></c></b></ab><cd><ok>cc</ok>dd</cd></xml>
)
def test7() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab a="a" b="b"/>),
(4, 8, <cd c="c" d="d"/>)),
expected = <xml><ab a="a" b="b">aabb</ab><cd c="c" d="d">ccdd</cd></xml>
)
def failedSpans() = {
val text = "aabbccdd"
val spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(4, 6, <err/>),
(4, 8, <cd/>))
try {
val res = spansToXML(text, spans)
assert(false)
} catch {
case e => {
println("This generate invalid XML:")
println("<xml><ab>aabb</ab><err><cd>cc</err>dd</cd></xml>")
println(e.getMessage)
}
}
}
def main(args: Array[String]) {
test1()
test2()
test3alt()
test4()
test5alt()
test5b()
test6()
test7()
failedSpans()
}
}
SpanToXmlAlt.main(Array())
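Answer 2 (score: 1)
My solution is recursive. I sort the input Seq as needed and convert it to a List. After that it is basic pattern matching according to the specification. The biggest drawback of my solution is that == in the test method does not yield true, even though .toString produces the same strings.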
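One plausible cause of that mismatch, offered as a guess rather than a confirmed diagnosis: when the recursion reaches the end of the text it still emits a Text node, which can be empty. An empty Text node prints as nothing but still counts as a node when sequences are compared. The sketch below assumes scala-xml is on the classpath; EmptyTextDemo, single, and padded are illustrative names.

```scala
import scala.xml.{Elem, NodeSeq, Text}

object EmptyTextDemo {
  val single: Elem = <a/>

  // The same element followed by an empty text node, similar to what a
  // recursion ending in Text("") might produce.
  val padded: NodeSeq = NodeSeq.fromSeq(Seq(<a/>, Text("")))

  def main(args: Array[String]): Unit = {
    println(single.toString == padded.toString) // same rendering
    println(single == padded)                   // different node structure
  }
}
```

If this is indeed the cause, dropping empty Text nodes before returning the result would be one way to make == usable in the tests.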
import scala.xml.{NodeSeq, Elem, Text}
object SpansToXml {
type NodeSpan = (Int, Int, Elem)
def adjustIndices(offset: Int, spans: List[NodeSpan]) = spans.map {
case (spanStart, spanEnd, spanNode) => (spanStart - offset, spanEnd - offset, spanNode)
}
def sortedSpansToXml(text: String, spans: List[NodeSpan]): NodeSeq = {
spans match {
// current span starts and ends at index 0, thus no inner text exists
case (0, 0, node) :: rest => node +: sortedSpansToXml(text, rest)
// current span starts at index 0 and ends somewhere greater than 0
case (0, end, node) :: rest =>
// partition the text and the remaining spans in inner and outer and process both independently
val (innerSpans, outerSpans) = rest.partition {
case (spanStart, spanEnd, spanNode) => spanStart <= end && spanEnd <= end
}
val (innerText, outerText) = text.splitAt(end)
// prepend the generated node to the outer xml
node.copy(child = node.child ++ sortedSpansToXml(innerText, innerSpans)) +: sortedSpansToXml(outerText, adjustIndices(end, outerSpans))
// current span starts at an index larger than 0, convert text prefix to text node
case (start, end, node) :: rest =>
val (pre, spanned) = text.splitAt(start)
Text(pre) +: sortedSpansToXml(spanned, adjustIndices(start, spans))
// all spans consumed: we can just return the text as node
case Nil =>
Text(text)
}
}
def spansToXml(xmlText: String, nodeSpans: Seq[NodeSpan]) = {
val sortedSpans = nodeSpans.toList.sortBy {
case (start, end, _) => (start, -end)
}
sortedSpansToXml(xmlText, sortedSpans)
}
// test code stolen from Blaisorblade and david.rosell
def test(text: String, spans: Seq[(Int, Int, Elem)], expected: NodeSeq) {
val res = spansToXml(text, spans)
print("Text: \"%s\", expected:\n%s\nResult:\n%s\n\n" format (text, expected, res))
// Had to resort to comparing toString output here.
assert(expected.toString == res.toString)
}
def test1() =
test(
text = "The dog chased the cat.",
spans = Seq((0, 23, <xml/>),(4, 22, <phrase/>),(4, 7, <token/>)),
expected = <xml>The <phrase><token>dog</token> chased the cat</phrase>.</xml>
)
def test2() =
test(
text = "aabbccdd",
spans = Seq(
(0, 8, <xml x="1"/>),
(0, 4, <ab y="foo"/>),
(4, 8, <cd z="42>3"/>)),
expected = <xml x="1"><ab y="foo">aabb</ab><cd z="42>3">ccdd</cd></xml>
)
def test3() =
test(
text = " ",
spans = Seq(
(0, 1, <a/>),
(0, 0, <b/>),
(0, 0, <c/>),
(1, 1, <d/>),
(1, 1, <e/>)),
expected = <a><b/><c/> <d/><e/></a>
)
def test4() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aabb</ab><cd><ok>cc</ok>dd</cd></xml>
)
def test5() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(2, 4, <b/>),
(4, 4, <empty/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aa<b>bb<empty/></b></ab><cd><ok>cc</ok>dd</cd></xml>
)
def test6() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab/>),
(2, 4, <b/>),
(2, 4, <c/>),
(3, 4, <d/>),
(4, 8, <cd/>),
(4, 6, <ok/>)),
expected = <xml><ab>aa<b><c>b<d>b</d></c></b></ab><cd><ok>cc</ok>dd</cd></xml>
)
def test7() =
test(
text = "aabbccdd",
spans = Seq((0, 8, <xml/>),
(0, 4, <ab a="a" b="b"/>),
(4, 8, <cd c="c" d="d"/>)),
expected = <xml><ab a="a" b="b">aabb</ab><cd c="c" d="d">ccdd</cd></xml>
)
}
Answer 3 (score: 0)
You can easily create XML nodes dynamically:
scala> import scala.xml._
import scala.xml._
scala> Elem(null, "AAA",xml.Null,xml.TopScope, Array[Node]():_*)
res2: scala.xml.Elem = <AAA></AAA>
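Here is the signature of Elem.apply: def apply(prefix: String, label: String, attributes: MetaData, scope: NamespaceBinding, child: Node*): Elem
The only problem I see with this approach is that you need to build the inner nodes first.
Something to make things easier: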
scala> def elem(name:String, children:Node*) = Elem(null, name ,xml.Null,xml.TopScope, children:_*); def elem(name:String):Elem=elem(name, Array[Node]():_*);
scala> elem("A",elem("B"))
res11: scala.xml.Elem = <A><B></B></A>
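Answer 4 (score: 0)
Here is a solution using string concatenation and XML.loadString that is close to correct:
import scala.collection.mutable.{Buffer, Map}
import scala.xml.{Elem, Node, XML}
def spansToXML(text: String, spans: Seq[(Int, Int, Elem)]): Node = {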
// arrange items so that at each offset:
// closing tags sort before opening tags
// with two opening tags, the one with the later closing tag sorts first
// with two closing tags, the one with the later opening tag sorts first
val items = Buffer[(Int, Int, Int, String)]()
for ((begin, end, elem) <- spans) {
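// Assumes elem.toString renders an empty element as <x></x>, as Scala did at the time;
// if it renders <x/> instead, the split below yields no separate closing tag.
val elemStr = elem.toString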
val splitIndex = elemStr.indexOf('>') + 1
val beginTag = elemStr.substring(0, splitIndex)
val endTag = elemStr.substring(splitIndex)
items += ((begin, +1, -end, beginTag))
items += ((end, -1, -begin, endTag))
}
// group tags to be inserted by index
val inserts = Map[Int, Buffer[String]]()
for ((index, _, _, tag) <- items.sorted) {
inserts.getOrElseUpdate(index, Buffer[String]()) += tag
}
// put tags and characters into a buffer
val result = Buffer[String]()
for (i <- 0 until text.size + 1) {
for (tags <- inserts.get(i); tag <- tags) {
result += tag
}
result += text.slice(i, i + 1)
}
// create XML from the string buffer
XML.loadString(result.mkString)
}
This passes the first two test cases but fails on the third.
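The sort key can be tried out without any XML at all. The sketch below uses plain strings for the tag names (an assumption made to keep it self-contained; tagEvents and orderedTags are hypothetical names) and applies the same key of offset, +1 or -1, and the negated other endpoint. It also suggests why zero-length spans are a problem: both events of an empty span share an offset, so its closing tag sorts ahead of its own opening tag.

```scala
object TagOrderDemo {
  // Each span contributes an opening and a closing event keyed as
  // (offset, +1 for open / -1 for close, negated other endpoint, tag text).
  def tagEvents(spans: Seq[(Int, Int, String)]): Seq[(Int, Int, Int, String)] =
    spans.flatMap { case (begin, end, name) =>
      Seq((begin, +1, -end, "<" + name + ">"),
          (end, -1, -begin, "</" + name + ">"))
    }

  // Sorting the tuples lexicographically puts closing tags before opening
  // tags at the same offset, and outer opening tags before inner ones.
  def orderedTags(spans: Seq[(Int, Int, String)]): Seq[String] =
    tagEvents(spans).sorted.map(_._4)

  def main(args: Array[String]): Unit = {
    // Properly nested, non-empty spans come out well-formed.
    println(orderedTags(Seq((0, 8, "xml"), (0, 4, "ab"), (4, 8, "cd"))).mkString)
    // A zero-length span (0, 0) closes before it opens.
    println(orderedTags(Seq((0, 1, "a"), (0, 0, "b"))).mkString)
  }
}
```

The first call prints <xml><ab></ab><cd></cd></xml>, while the second prints </b><a><b></a>, which is not well-formed. Empty spans would need separate handling, for example by emitting them as self-closing tags.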