XSLT 2.0: Transform notation in plain text to SVG

Time: 2016-01-08 17:03:29

Tags: regex xslt xpath svg xslt-2.0

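I am not familiar with transforming between different formats. My goal is to convert a notation from a toolkit, which is in plain-text format, to SVG. As a simple example, I have an orange ellipse, and the notation looks like this (x and y are coordinates, so 0 and 0 mean the ellipse is in the middle):

GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
ELLIPSE x:0pt y:0pt rx:114pt ry:70pt

My desired output is SVG code like this (the cx and cy coordinates were chosen at random for the example):

<svg width="400" height="400" xmlns="http://www.w3.org/2000/svg" xmlns:svg="http://www.w3.org/2000/svg">
 <g>
  <ellipse fill="#ff7f00" stroke="#000000" stroke-width="2" stroke-dasharray="null" stroke-linejoin="null" stroke-linecap="null" cx="250" cy="250" id="svg_1" rx="114" ry="70"/>
 </g>
</svg>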

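I found these two posts, "Parse text file with XSLT" and "XSL transform on text to XML with unparsed-text: need more depth". They use XSLT 2.0 with the unparsed-text() function and regular expressions to convert plain text to XML. In my example, how do I get a command such as ELLIPSE (is a regex that recognises all possible upper-case words a good idea?) and its parameters (is there any way to apply XPath to plain text?)? Is a good implementation possible in XSLT 2.0, or should I look for another approach? Any help would be appreciated!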

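2 answers:

Answer 0: (score: 5)

Below is an example of how you could load the text file with unparsed-text(), parse the content with xsl:analyze-string to generate an intermediate XML document, and then transform that XML with a "push"-style stylesheet.

It shows how to support converting ELLIPSE, CIRCLE, and RECTANGLE text. You will probably need to customize it, but it should give you an idea of what is possible. With the addition of regular expressions and unparsed-text(), XSLT 2.0 and 3.0 make all sorts of text transformations possible that were extremely cumbersome or difficult in XSLT 1.0.

With a file named "drawing.txt" that has the following content: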

GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
ELLIPSE x:0pt y:0pt rx:114pt ry:70pt

GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
CIRCLE x:0pt y:0pt rx:114pt ry:70pt

GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
RECTANGLE x:0pt y:0pt width:114pt height:70pt

Executing the following XSLT in the same directory:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="local"
    exclude-result-prefixes="xs"
    version="2.0"
    xmlns:svg="http://www.w3.org/2000/svg">
    <xsl:output indent="yes"/>

    <!--matches sequences of UPPER-CASE letters -->
    <xsl:variable name="label-pattern" select="'[A-Z]+'"/>
    <!--matches the "attributes" in the line i.e. w:2pt,
        has two capture groups (1) => attribute name, (2) => attribute value -->
    <xsl:variable name="attribute-pattern" select="'\s?(\S+):(\S+)'"/> 
    <!--matches a line of data for the drawing text, 
        has two capture groups (1) => label, (2) attribute data-->
    <xsl:variable name="line-pattern" select="concat('(', $label-pattern, ')\s(.*)\n?')"/>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/">
        <svg width="400" height="400">
            <g>
                <!-- Find the text patterns indicating the shape -->
                <xsl:analyze-string select="unparsed-text('drawing.txt')"
                    regex="{concat('(', $label-pattern, ')\n((', $line-pattern, ')+)\n?')}">
                    <xsl:matching-substring>
                        <!--Convert text to XML -->
                        <xsl:variable name="drawing-markup" as="element()">
                            <!--Create an element for this group, using first matched pattern as the element name 
                                (i.e. GRAPHREP => <GRAPHREP>) -->
                            <xsl:element name="{regex-group(1)}">
                                <!--split the second matched group for this shape into lines by breaking on newline-->
                                <xsl:variable name="lines" select="tokenize(regex-group(2), '\n')"/>
                                <xsl:for-each select="$lines">
                                    <!--for each line, run through this process to create an element with attributes
                                        (e.g. FILL color:$ff7f00 => <FILL color="#ff7f00"/>)
                                    -->
                                    <xsl:analyze-string select="." regex="{$line-pattern}">
                                        <xsl:matching-substring>
                                            <!--create an element using the UPPER-CASE label starting the line -->
                                            <xsl:element name="{regex-group(1)}">
                                                <!-- capture each of the attributes -->
                                                <xsl:analyze-string select="regex-group(2)" regex="{$attribute-pattern}">
                                                  <xsl:matching-substring>
                                                  <!--convert foo:bar into attribute foo="bar",
                                                            translate $ => #
                                                            and remove the letters 'p' and 't' by translating them into nothing-->
                                                  <xsl:attribute name="{regex-group(1)}" select="translate(regex-group(2), '$pt', '#')"/>
                                                  </xsl:matching-substring>
                                                  <xsl:non-matching-substring/>
                                                </xsl:analyze-string>
                                            </xsl:element>
                                        </xsl:matching-substring>
                                        <xsl:non-matching-substring/>
                                    </xsl:analyze-string>
                                </xsl:for-each>
                            </xsl:element>
                        </xsl:variable>
                        <!--Uncomment the copy-of below if you want to see the intermediate XML $drawing-markup-->
                        <!--<xsl:copy-of select="$drawing-markup"/>-->

                        <!-- Transform XML into SVG -->
                        <xsl:apply-templates select="$drawing-markup"/>

                    </xsl:matching-substring>
                    <xsl:non-matching-substring/>
                </xsl:analyze-string>
            </g>
        </svg>
    </xsl:template>

    <!--==========================================-->
    <!-- Templates to convert the $drawing-markup -->
    <!--==========================================-->

    <!--for supported shapes, create the element using
        lower-case value, and change rectangle to rect
        for the svg element name-->
    <xsl:template match="GRAPHREP[ELLIPSE | CIRCLE | RECTANGLE]">
        <xsl:element name="{replace(lower-case(local-name(ELLIPSE | CIRCLE | RECTANGLE)), 'rectangle', 'rect', 'i')}">
            <xsl:attribute name="id" select="concat('id_', generate-id())"/>
            <xsl:apply-templates />
        </xsl:element>
    </xsl:template>

    <xsl:template match="ELLIPSE | CIRCLE | RECTANGLE"/>

    <!-- Just process the content of GRAPHREP.
        If there are multiple shapes and you want a new 
        <svg><g></g></svg> for each shape, 
        then move it from the template for "/" into this template-->
    <xsl:template match="GRAPHREP/*">
        <xsl:apply-templates select="@*"/>
    </xsl:template>

    <xsl:template match="PEN" priority="1">
        <!--TODO: test if these attributes exist, if they do, do not create these defaults.
            Hard-coding for now, to match desired output, since I don't know what the text
            attributes would be, but could wrap each with <xsl:if test="not(@dasharray)">-->
        <xsl:attribute name="stroke-dasharray" select="'null'"/>
        <xsl:attribute name="stroke-linejoin" select="'null'"/>
        <xsl:attribute name="stroke-linecap" select="'null'"/>
        <xsl:apply-templates select="@*"/>
    </xsl:template>

    <!-- converts @color => @stroke -->
    <xsl:template match="PEN/@color">
        <xsl:attribute name="stroke" select="."/>
    </xsl:template>

    <!--converts @w => @stroke-width -->
    <xsl:template match="PEN/@w">
        <xsl:attribute name="stroke-width" select="."/>
    </xsl:template>

    <!--converts @color => @fill and replaces $ with # -->
    <xsl:template match="FILL/@color">
        <xsl:attribute name="fill" select="translate(., '$', '#')"/>
    </xsl:template>

    <!-- converts @x => @cx with hard-coded values. 
        May want to use value from text, but matching your example-->
    <xsl:template match="ELLIPSE/@x | ELLIPSE/@y">
        <!--not sure if there was a relationship between ELLIPSE x:0pt y:0pt, and why 0pt would be 250, 
            but just an example...-->
        <xsl:attribute name="c{name()}" select="250"/>
    </xsl:template>

</xsl:stylesheet>

It produces the following SVG output:

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns:local="local"
     xmlns:svg="http://www.w3.org/2000/svg"
     width="400"
     height="400">
   <g>
      <ellipse id="id_d2e0"
               stroke-dasharray="null"
               stroke-linejoin="null"
               stroke-linecap="null"
               stroke="#000000"
               stroke-width="2"
               fill="#ff7f00"
               cx="250"
               cy="250"
               rx="114"
               ry="70"/>
      <circle id="id_d3e0"
              stroke-dasharray="null"
              stroke-linejoin="null"
              stroke-linecap="null"
              stroke="#000000"
              stroke-width="2"
              fill="#ff7f00"
              x="0"
              y="0"
              rx="114"
              ry="70"/>
      <rect id="id_d4e0"
            stroke-dasharray="null"
            stroke-linejoin="null"
            stroke-linecap="null"
            stroke="#000000"
            stroke-width="2"
            fill="#ff7f00"
            x="0"
            y="0"
            width="114"
            height="70"/>
   </g>
</svg>
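
To make it easier to see what the stylesheet's patterns capture, here is a minimal Python sketch that applies the same label and attribute expressions to single lines of the notation. It is only an illustration: the names LABEL_PATTERN, ATTRIBUTE_PATTERN, LINE_PATTERN and parse_line are made up for this sketch, and Python's re module stands in for xsl:analyze-string.

```python
import re

# Same expressions as the stylesheet's $label-pattern and $attribute-pattern
LABEL_PATTERN = r"[A-Z]+"
ATTRIBUTE_PATTERN = r"\s?(\S+):(\S+)"
# One line of the notation: (1) label, (2) the rest of the line
LINE_PATTERN = r"(" + LABEL_PATTERN + r")\s(.*)"


def parse_line(line):
    """Return (label, attributes) for one notation line, or None if the line does not match."""
    match = re.match(LINE_PATTERN, line)
    if match is None:
        return None
    label, rest = match.groups()
    # str.translate mirrors XPath translate(value, '$pt', '#'):
    # '$' becomes '#', and the letters 'p' and 't' are deleted
    table = str.maketrans("$", "#", "pt")
    attributes = {name: value.translate(table)
                  for name, value in re.findall(ATTRIBUTE_PATTERN, rest)}
    return label, attributes


print(parse_line("PEN color:$000000 w:2pt"))
print(parse_line("ELLIPSE x:0pt y:0pt rx:114pt ry:70pt"))
```

Note that the translate step deletes every 'p' and 't' in a value, not just a trailing "pt" unit, so a value that legitimately contains those letters would be mangled. That is fine for the sample data, but it is worth keeping in mind when extending the notation.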

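Answer 1: (score: 0)

XSLT is very handy when you want to transform an XML file into something else. But advanced logic is hard to express in it, because the syntax is clumsy and all variables are constants. Your parsing looks advanced and your input is not XML, so XSLT would not be the first tool I would choose here.

If you are not tied to a particular tool, I would write a parser in a scripting language such as Python, let it create objects from the input, and then let each object generate its own representation as XML.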
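
As a rough sketch of that object-based design, each shape could be a small class that renders itself as an SVG element. The class names Style and Ellipse, the to_svg method, and the default values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from xml.etree.ElementTree import Element, tostring


@dataclass
class Style:
    """Current PEN and FILL state; the defaults are assumptions."""
    stroke: str = "#000000"
    stroke_width: str = "1"
    fill: str = "#ffffff"


@dataclass
class Ellipse:
    """Hypothetical shape object that knows how to render itself."""
    cx: str
    cy: str
    rx: str
    ry: str
    style: Style

    def to_svg(self):
        # Build the <ellipse> element from the shape's own fields
        return Element("ellipse", {
            "cx": self.cx, "cy": self.cy, "rx": self.rx, "ry": self.ry,
            "fill": self.style.fill,
            "stroke": self.style.stroke,
            "stroke-width": self.style.stroke_width,
        })


shape = Ellipse("250", "250", "114", "70", Style(stroke_width="2", fill="#ff7f00"))
print(tostring(shape.to_svg(), encoding="unicode"))
```

With this layout, supporting another command such as RECTANGLE would mean adding one more class with its own to_svg method, without touching the parsing loop.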

Edit: A super simple Python script could look like the following; you could easily rewrite it in Java:

import re

from xml.etree.ElementTree import Element, SubElement, tostring

def only_numbers(s):
    # Keep digits only, e.g. "114pt" -> "114" (drops signs and decimal points too)
    return re.sub(r"[^0-9]", "", s)

def only_hex(s):
    # Keep hex digits only, e.g. "$ff7f00" -> "ff7f00"
    return re.sub(r"[^0-9a-fA-F]", "", s)

# Initialise svg xml
root = Element("svg")
root.set("width", "400")
root.set("height", "400")
root.set("xmlns", "http://www.w3.org/2000/svg")
g = SubElement(root, "g")

# Initialise current state
pen_color = "000000"
pen_width = "1"
fill_color = "ffffff"

# Read input
with open("parsing_data.txt") as f:
    for line in f:
        words = line.split()
        if not words:
            # Skip blank lines between shapes
            continue
        action = words[0]
        # Split each "key:value" word once, at the first colon
        params = dict(word.split(":", 1) for word in words[1:])
        if action == "GRAPHREP":
            pass
        elif action == "PEN":
            if "color" in params:
                pen_color = only_hex(params["color"])
            if "w" in params:
                pen_width = only_numbers(params["w"])
        elif action == "FILL":
            if "color" in params:
                fill_color = only_hex(params["color"])
        elif action == "ELLIPSE":
            ellipse = SubElement(g, "ellipse")
            ellipse.set("fill", "#" + fill_color)
            ellipse.set("stroke", "#" + pen_color)
            ellipse.set("stroke-width", pen_width)
            ellipse.set("cx", only_numbers(params["x"]))
            ellipse.set("cy", only_numbers(params["y"]))
            ellipse.set("rx", only_numbers(params["rx"]))
            ellipse.set("ry", only_numbers(params["ry"]))
        else:
            raise Exception("Invalid input line: " + line)

# Print result (encoding="unicode" makes tostring return a str in Python 3)
print(tostring(root, encoding="unicode"))
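
The script writes x and y straight into cx and cy, so an ellipse at 0,0 would end up in the top-left corner of the canvas rather than in the middle, as the question describes. A possible conversion is sketched below. It assumes that the notation's origin is the centre of the canvas, that 1pt corresponds to one SVG user unit, and that y grows downward in both systems; none of this is confirmed by the question, and the helper names to_svg_center and strip_pt are made up.

```python
def to_svg_center(x, y, width=400, height=400):
    """Map centre-origin coordinates to SVG's top-left-origin coordinates.

    Assumes 1pt equals one SVG user unit and that y grows downward in
    both systems; flip the sign of y if the toolkit's y axis points up.
    """
    return width / 2 + x, height / 2 + y


def strip_pt(value):
    # "114pt" -> 114.0; keeps a sign and decimals, unlike a digits-only filter
    return float(value[:-2]) if value.endswith("pt") else float(value)


cx, cy = to_svg_center(strip_pt("0pt"), strip_pt("0pt"))
print(cx, cy)
```

On the 400 by 400 canvas from the question, this places an ellipse at x:0pt y:0pt at the centre of the canvas instead of at the hard-coded 250.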