Sorry if a similar question has already been asked; I'm still new to XSL and couldn't find a suitable answer.
I'm trying to transform an XML file into another XML file. The problem is that the only nodes I have in the input XML are <p> elements. I have to extract the text content of these elements and create new nodes from it, then merge some of the other nodes into the new ones. The second problem is that there is no real consistency in the input XML. I'm really frustrated.
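(The input XML I'm working with is longer than the sample given, but it follows the same pattern: a div with class page, with each div holding content and paragraph sections.)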
Input XML:
<p>
The output I want to get looks like this:
<root>
  <div class="page">
    <p>Content:</p>
    <p>This is the content. </p>
    <p>Content continues. </p>
    <p>End content.</p>
    <p>Paragraph:</p>
    <p>◼ Beginning of new paragraph. </p>
    <p>End of new paragraph.</p>
    <p>◼ New line here.</p>
    <p>Content:</p>
    <p>Heres lies the second content </p>
    <p>Continiuation of the second content. </p>
    <p>Second content ends.</p>
    <p>Paragraph:</p>
    <p>◼ Start of second paragraph. </p>
    <p>Finish of second paragraph.</p>
    <p>◼ This should also be separate.</p>
  </div>
  <div class="page">
    <p>Content:</p>
    <p>Third content starts here. </p>
    <p>Third content continues. </p>
    <p>End content three.</p>
    <p>Paragraph:</p>
    <p>◼ Beginning of third paragraph. </p>
    <p>End of third paragraph.</p>
    <p>◼ And again a new line.</p>
  </div>
</root>
Answer 0 (score: 0)
I'm not sure of the exact logic you need, but you probably want to use xsl:for-each-group here.
So, first select the p elements and group them so that each group starts with an element whose text ends with a colon:
<xsl:for-each-group select="p" group-starting-with="p[ends-with(., ':')]">
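You can then use current-group() to process each group. The paragraphs need a bit more work, though, because you need a nested xsl:for-each-group to handle the paragraphs that start with that funny symbol (◼):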
<xsl:for-each-group select="current-group() except ." group-starting-with="p[starts-with(., '◼')]">
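If it helps to see how the p elements fall into groups before deciding on an output structure, a small diagnostic stylesheet along these lines might be useful. This is only a sketch: the groups and group elements and their start and size attributes are made-up names, and it assumes the input has div elements with class "page" containing p children, as in the sample from the question.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:template match="div[@class='page']">
    <!-- "groups" and "group" are hypothetical names, used only for inspection -->
    <groups>
      <xsl:for-each-group select="p" group-starting-with="p[ends-with(., ':')]">
        <!-- Inside the loop, "." is the first p of the group and
             current-group() is every p in it, including that first one -->
        <group start="{.}" size="{count(current-group())}" />
      </xsl:for-each-group>
    </groups>
  </xsl:template>
</xsl:stylesheet>
```

Note that ends-with compares the exact string value, so a heading with trailing whitespace (for example "Content: ") would not start a new group; wrapping the test as ends-with(normalize-space(.), ':') is one way around that if the real data is less tidy than the sample.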
Try this XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" indent="yes" />
  <xsl:strip-space elements="*" />

  <xsl:template match="div[@class='page']">
    <page>
      <!-- Each group starts at a p whose text ends with a colon -->
      <xsl:for-each-group select="p" group-starting-with="p[ends-with(., ':')]">
        <!-- "." is the heading p; groups with any other heading produce no output -->
        <xsl:choose>
          <xsl:when test=". = 'Content:'">
            <title><xsl:value-of select="." /></title>
            <!-- Join the text of every p in the group except the heading -->
            <content>
              <xsl:value-of select="current-group() except ." separator="" />
            </content>
          </xsl:when>
          <xsl:when test=". = 'Paragraph:'">
            <paragraph><xsl:value-of select="." /></paragraph>
            <!-- Regroup the remaining p elements, starting a new group at each ◼ -->
            <xsl:for-each-group select="current-group() except ." group-starting-with="p[starts-with(., '◼')]">
              <pcontent>
                <xsl:value-of select="current-group()" separator="" />
              </pcontent>
            </xsl:for-each-group>
          </xsl:when>
        </xsl:choose>
      </xsl:for-each-group>
    </page>
  </xsl:template>
</xsl:stylesheet>
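One thing to be aware of: this stylesheet has no template for the document element, so the built-in rules simply walk down to the div elements and the resulting page elements end up side by side with no single wrapper around them. If you need one well-formed document, a wrapper template could be added. The sketch below assumes the document element is root, as in the sample, that the stylesheet above has been saved under the hypothetical name pages.xsl, and that a made-up pages element is an acceptable wrapper.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- Hypothetical file name: the stylesheet above saved as pages.xsl.
       Its xsl:output and xsl:strip-space declarations are included too. -->
  <xsl:include href="pages.xsl" />

  <!-- Assumes the document element is <root>; "pages" is a made-up wrapper name -->
  <xsl:template match="/root">
    <pages>
      <xsl:apply-templates select="div[@class='page']" />
    </pages>
  </xsl:template>
</xsl:stylesheet>
```

Alternatively, the match="/root" template can be pasted directly into the original stylesheet instead of using xsl:include; the effect should be the same.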