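I am not familiar with converting between different formats. My goal is to convert a notation from a toolkit's plain-text format to SVG. A simple example is an orange ellipse, whose notation looks like this (x and y are relative to a coordinate system, so 0 and 0 mean the ellipse is in the middle):

GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
ELLIPSE x:0pt y:0pt rx:114pt ry:70pt

The output I want is SVG code like this (the cx and cy coordinates in the example were chosen arbitrarily):

<svg width="400" height="400" xmlns="http://www.w3.org/2000/svg" xmlns:svg="http://www.w3.org/2000/svg">
  <g>
    <ellipse fill="#ff7f00" stroke="#000000" stroke-width="2" stroke-dasharray="null" stroke-linejoin="null" stroke-linecap="null" cx="250" cy="250" id="svg_1" rx="114" ry="70"/>
  </g>
</svg>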
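I found these two posts, "Parse text file with XSLT" and "XSL transform on text to XML with unparsed-text: need more depth". They use XSLT 2.0 with the unparsed-text() function and regular expressions to convert plain text to XML. In my example, how do I pick up commands such as ELLIPSE (is there a regular expression that recognizes all possible upper-case words?) and the parameters (is there any way to reach them with XPath in plain text?)? Is a good implementation possible in XSLT 2.0, or should I look for another approach? Any help would be appreciated!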
Answer 0 (score: 5)
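Below is an example of how to load a text file with unparsed-text(), parse its content with xsl:analyze-string to produce an intermediate XML document, and then transform that XML with a "push"-style stylesheet.
It shows how to support converting ELLIPSE, CIRCLE, and RECTANGLE text. You will probably need to customize it a bit, but it should give you an idea of what is possible. With the addition of regular expressions and unparsed-text(), XSLT 2.0 and 3.0 make possible all sorts of text transformations that were very cumbersome or difficult in XSLT 1.0.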
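To see the parsing idea in isolation, here is a minimal Python sketch of the same two patterns the stylesheet in this answer relies on: `[A-Z]+` for the command label and `(\S+):(\S+)` for each `name:value` pair. The function name `parse_line` is made up for this illustration and is not part of the stylesheet.

```python
import re

# Same ideas as the stylesheet's $label-pattern and $attribute-pattern.
LABEL_PATTERN = re.compile(r"^([A-Z]+)(?:\s+(.*))?$")
ATTRIBUTE_PATTERN = re.compile(r"(\S+):(\S+)")


def parse_line(line):
    """Split one line into (label, {name: value}); return None if no label."""
    match = LABEL_PATTERN.match(line.strip())
    if match is None:
        return None
    label, rest = match.group(1), match.group(2) or ""
    # findall returns (name, value) tuples, one per name:value pair
    return label, dict(ATTRIBUTE_PATTERN.findall(rest))


print(parse_line("ELLIPSE x:0pt y:0pt rx:114pt ry:70pt"))
print(parse_line("GRAPHREP"))
```

A line that does not start with an upper-case word simply yields `None`, which plays the same role as the empty `xsl:non-matching-substring` branches in the stylesheet.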
Given a file named "drawing.txt" with the following content:
GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
ELLIPSE x:0pt y:0pt rx:114pt ry:70pt
GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
CIRCLE x:0pt y:0pt rx:114pt ry:70pt
GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
RECTANGLE x:0pt y:0pt width:114pt height:70pt
run the following XSLT from the same directory:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="local"
    exclude-result-prefixes="xs"
    version="2.0"
    xmlns:svg="http://www.w3.org/2000/svg">

    <xsl:output indent="yes"/>

    <!--matches sequences of UPPER-CASE letters -->
    <xsl:variable name="label-pattern" select="'[A-Z]+'"/>
    <!--matches the "attributes" in the line i.e. w:2pt,
        has two capture groups (1) => attribute name, (2) => attribute value -->
    <xsl:variable name="attribute-pattern" select="'\s?(\S+):(\S+)'"/>
    <!--matches a line of data for the drawing text,
        has two capture groups (1) => label, (2) => attribute data-->
    <xsl:variable name="line-pattern" select="concat('(', $label-pattern, ')\s(.*)\n?')"/>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/">
        <svg width="400" height="400">
            <g>
                <!-- Find the text patterns indicating the shape -->
                <xsl:analyze-string select="unparsed-text('drawing.txt')"
                    regex="{concat('(', $label-pattern, ')\n((', $line-pattern, ')+)\n?')}">
                    <xsl:matching-substring>
                        <!--Convert text to XML -->
                        <xsl:variable name="drawing-markup" as="element()">
                            <!--Create an element for this group, using first matched pattern as the element name
                                (i.e. GRAPHREP => <GRAPHREP>) -->
                            <xsl:element name="{regex-group(1)}">
                                <!--split the second matched group for this shape into lines by breaking on newline-->
                                <xsl:variable name="lines" select="tokenize(regex-group(2), '\n')"/>
                                <xsl:for-each select="$lines">
                                    <!--for each line, run through this process to create an element with attributes
                                        (e.g. FILL color:$ff7f00 => <FILL color="#ff7f00"/>)
                                    -->
                                    <xsl:analyze-string select="." regex="{$line-pattern}">
                                        <xsl:matching-substring>
                                            <!--create an element using the UPPER-CASE label starting the line -->
                                            <xsl:element name="{regex-group(1)}">
                                                <!-- capture each of the attributes -->
                                                <xsl:analyze-string select="regex-group(2)" regex="{$attribute-pattern}">
                                                    <xsl:matching-substring>
                                                        <!--convert foo:bar into attribute foo="bar",
                                                            translate $ => #
                                                            and remove the letters 'p' and 't' by translating into nothing-->
                                                        <xsl:attribute name="{regex-group(1)}" select="translate(regex-group(2), '$pt', '#')"/>
                                                    </xsl:matching-substring>
                                                    <xsl:non-matching-substring/>
                                                </xsl:analyze-string>
                                            </xsl:element>
                                        </xsl:matching-substring>
                                        <xsl:non-matching-substring/>
                                    </xsl:analyze-string>
                                </xsl:for-each>
                            </xsl:element>
                        </xsl:variable>
                        <!--Uncomment the copy-of below if you want to see the intermediate XML $drawing-markup-->
                        <!--<xsl:copy-of select="$drawing-markup"/>-->
                        <!-- Transform XML into SVG -->
                        <xsl:apply-templates select="$drawing-markup"/>
                    </xsl:matching-substring>
                    <xsl:non-matching-substring/>
                </xsl:analyze-string>
            </g>
        </svg>
    </xsl:template>

    <!--==========================================-->
    <!-- Templates to convert the $drawing-markup -->
    <!--==========================================-->

    <!--for supported shapes, create the element using
        lower-case value, and change rectangle to rect
        for the svg element name-->
    <xsl:template match="GRAPHREP[ELLIPSE | CIRCLE | RECTANGLE]">
        <xsl:element name="{replace(lower-case(local-name(ELLIPSE | CIRCLE | RECTANGLE)), 'rectangle', 'rect', 'i')}">
            <xsl:attribute name="id" select="concat('id_', generate-id())"/>
            <xsl:apply-templates />
        </xsl:element>
    </xsl:template>

    <xsl:template match="ELLIPSE | CIRCLE | RECTANGLE"/>

    <!-- Just process the content of GRAPHREP.
        If there are multiple shapes and you want a new
        <svg><g></g></svg> for each shape,
        then move it from the template for "/" into this template-->
    <xsl:template match="GRAPHREP/*">
        <xsl:apply-templates select="@*"/>
    </xsl:template>

    <xsl:template match="PEN" priority="1">
        <!--TODO: test if these attributes exist, if they do, do not create these defaults.
            Hard-coding for now, to match desired output, since I don't know what the text
            attributes would be, but could wrap each with <xsl:if test="not(@dasharray)">-->
        <xsl:attribute name="stroke-dasharray" select="'null'"/>
        <xsl:attribute name="stroke-linejoin" select="'null'"/>
        <xsl:attribute name="stroke-linecap" select="'null'"/>
        <xsl:apply-templates select="@*"/>
    </xsl:template>

    <!-- converts @color => @stroke -->
    <xsl:template match="PEN/@color">
        <xsl:attribute name="stroke" select="."/>
    </xsl:template>

    <!--converts @w => @stroke-width -->
    <xsl:template match="PEN/@w">
        <xsl:attribute name="stroke-width" select="."/>
    </xsl:template>

    <!--converts @color => @fill and replaces $ with # -->
    <xsl:template match="FILL/@color">
        <xsl:attribute name="fill" select="translate(., '$', '#')"/>
    </xsl:template>

    <!-- converts @x => @cx with hard-coded values.
        May want to use value from text, but matching your example-->
    <xsl:template match="ELLIPSE/@x | ELLIPSE/@y">
        <!--not sure if there was a relationship between ELLIPSE x:0pt y:0pt, and why 0pt would be 250,
            but just an example...-->
        <xsl:attribute name="c{name()}" select="250"/>
    </xsl:template>

</xsl:stylesheet>

It produces the following SVG output:

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns:local="local"
     xmlns:svg="http://www.w3.org/2000/svg"
     width="400"
     height="400">
   <g>
      <ellipse id="id_d2e0"
               stroke-dasharray="null"
               stroke-linejoin="null"
               stroke-linecap="null"
               stroke="#000000"
               stroke-width="2"
               fill="#ff7f00"
               cx="250"
               cy="250"
               rx="114"
               ry="70"/>
      <circle id="id_d3e0"
              stroke-dasharray="null"
              stroke-linejoin="null"
              stroke-linecap="null"
              stroke="#000000"
              stroke-width="2"
              fill="#ff7f00"
              x="0"
              y="0"
              rx="114"
              ry="70"/>
      <rect id="id_d4e0"
            stroke-dasharray="null"
            stroke-linejoin="null"
            stroke-linecap="null"
            stroke="#000000"
            stroke-width="2"
            fill="#ff7f00"
            x="0"
            y="0"
            width="114"
            height="70"/>
   </g>
</svg>
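The stylesheet hard-codes `cx` and `cy` to 250 to match the sample in the question. If `x:0pt y:0pt` really means "centre of the drawing", as the question says, one possible mapping is to offset each coordinate by half the canvas size. The Python sketch below only illustrates that arithmetic. The 400x400 canvas comes from the example, while the helper names `strip_pt` and `to_svg_center` and the idea that the toolkit's y axis points down like SVG's are assumptions made for this illustration.

```python
def strip_pt(value):
    """Turn a toolkit length such as '114pt' or '-12.5pt' into a float."""
    return float(value[:-2]) if value.endswith("pt") else float(value)


def to_svg_center(x, y, width=400, height=400):
    """Map centre-relative toolkit coordinates to SVG cx/cy values.

    Assumes the toolkit origin is the middle of the canvas and that its
    y axis points down like SVG's; flip the sign of y if it points up.
    """
    return width / 2 + strip_pt(x), height / 2 + strip_pt(y)


print(to_svg_center("0pt", "0pt"))    # centre of a 400x400 canvas
print(to_svg_center("50pt", "-20pt"))
```

Under these assumptions the ellipse from the example would land at cx=200, cy=200 rather than at the 250 used in the question.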
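Answer 1 (score: 0)
XSLT is very handy when you want to transform an XML file into something else. Advanced logic is hard to write in it, though, because the syntax is clumsy and all variables are constants. Your parsing looks fairly advanced and your input is not XML, so XSLT would not be my first choice of tool here.
If the tool is not fixed, I would write a parser in a scripting language such as Python, have it create objects from the input, and then let each object generate its own XML representation.
Edit: a super simple Python script could look like the following; you could easily rewrite it in Java: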
import re
from xml.etree.ElementTree import Element, SubElement, tostring


def only_numbers(s):
    # Keep digits, sign and decimal point; drops units such as "pt"
    return re.sub(r"[^0-9.\-]", "", s)


def only_hex(s):
    # Keep hex digits in either case; drops the leading "$"
    return re.sub(r"[^0-9a-fA-F]", "", s)


# Initialise svg xml
root = Element("svg")
root.set("width", "400")
root.set("height", "400")
root.set("xmlns", "http://www.w3.org/2000/svg")
g = SubElement(root, "g")

# Initialise current state
pen_color = "000000"
pen_width = "1"
fill_color = "ffffff"

# Read input
with open("parsing_data.txt") as f:
    for line in f:
        words = line.split()
        if not words:
            continue  # skip blank lines
        action = words[0]
        # Each remaining word has the form name:value
        params = dict(word.split(":", 1) for word in words[1:])
        if action == "GRAPHREP":
            pass
        elif action == "PEN":
            if "color" in params:
                pen_color = only_hex(params["color"])
            if "w" in params:
                pen_width = only_numbers(params["w"])
        elif action == "FILL":
            if "color" in params:
                fill_color = only_hex(params["color"])
        elif action == "ELLIPSE":
            ellipse = SubElement(g, "ellipse")
            ellipse.set("fill", "#" + fill_color)
            ellipse.set("stroke", "#" + pen_color)
            ellipse.set("stroke-width", pen_width)
            ellipse.set("cx", only_numbers(params["x"]))
            ellipse.set("cy", only_numbers(params["y"]))
            ellipse.set("rx", only_numbers(params["rx"]))
            ellipse.set("ry", only_numbers(params["ry"]))
        else:
            raise ValueError("Invalid input line: " + line)

# Print result (encoding="unicode" returns str rather than bytes)
print(tostring(root, encoding="unicode"))
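The design described at the start of this answer, where the parser creates objects and each object produces its own XML, could be sketched roughly as follows. This is only an illustration: the class names `Style` and `Ellipse`, the `to_svg` method, and the `parse` and `render` functions are all made up here, only the ELLIPSE command is handled, and the input is an inline sample instead of a file.

```python
import re
from dataclasses import dataclass, replace
from xml.etree.ElementTree import Element, SubElement, tostring


def strip_unit(value):
    """Drop a trailing unit such as 'pt', keeping sign and decimals."""
    return re.sub(r"[^0-9.\-]", "", value)


@dataclass(frozen=True)
class Style:
    """Pen and fill state that applies to the shapes that follow."""
    stroke: str = "#000000"
    stroke_width: str = "1"
    fill: str = "#ffffff"


@dataclass
class Ellipse:
    style: Style
    x: str
    y: str
    rx: str
    ry: str

    def to_svg(self, parent):
        """Append this shape to parent as an SVG <ellipse> element."""
        return SubElement(parent, "ellipse", {
            "fill": self.style.fill,
            "stroke": self.style.stroke,
            "stroke-width": self.style.stroke_width,
            "cx": self.x, "cy": self.y, "rx": self.rx, "ry": self.ry,
        })


def parse(lines):
    """Turn toolkit lines into shape objects, tracking the current style."""
    style, shapes = Style(), []
    for line in lines:
        words = line.split()
        if not words or words[0] == "GRAPHREP":
            continue
        params = dict(word.split(":", 1) for word in words[1:])
        if words[0] == "PEN":
            style = replace(
                style,
                stroke=params.get("color", style.stroke).replace("$", "#"),
                stroke_width=strip_unit(params.get("w", style.stroke_width)),
            )
        elif words[0] == "FILL":
            style = replace(style, fill=params.get("color", style.fill).replace("$", "#"))
        elif words[0] == "ELLIPSE":
            values = (strip_unit(params[key]) for key in ("x", "y", "rx", "ry"))
            shapes.append(Ellipse(style, *values))
        else:
            raise ValueError("Unsupported line: " + line)
    return shapes


def render(shapes):
    """Let every shape add itself to a fresh <svg><g> wrapper."""
    root = Element("svg", {"xmlns": "http://www.w3.org/2000/svg",
                           "width": "400", "height": "400"})
    group = SubElement(root, "g")
    for shape in shapes:
        shape.to_svg(group)
    return tostring(root, encoding="unicode")


SAMPLE = """GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
ELLIPSE x:0pt y:0pt rx:114pt ry:70pt"""

print(render(parse(SAMPLE.splitlines())))
```

Supporting another command would then mean adding one more class with its own `to_svg` method and one more branch in `parse`, without touching the rendering code.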