I need to index 5 different types of XML files. They share a similar structure, but each type differs slightly.
Example 1:
<?xml version="1.0"?>
<manifest>
<metadata>
<isbn>9780815341291</isbn>
<title>Essential Cell Biology, Third Edition</title>
<authors>
<author>Alberts;Bruce</author>
<author>Bray;Dennis</author>
</authors>
<categories>
<category>SCABC</category>
<category>SCDEF</category>
</categories>
</metadata>
<resources>
<audioresource>
<uuid>123456789</uuid>
<source>03_Mutations_Origin_Cancer.mp3</source>
<mimetype>audio/mpeg</mimetype>
<title>Part Three - Mutations and the Origin of Cancer</title>
<description>123</description>
<chapters>
<chapter>1</chapter>
</chapters>
</audioresource>
</resources>
</manifest>
Example 2:
<?xml version="1.0"?>
<manifest>
<metadata>
<isbn>9780815341291</isbn>
<title>Essential Cell Biology, Third Edition</title>
<authors>
<author>FN:Alberts;Bruce</author>
<author>FN:Bray;Dennis</author>
</authors>
<categories>
<category>SCABC</category>
<category>SCGHI</category>
</categories>
</metadata>
<resources>
<glossaryresource>
<uuid>123456789</uuid>
<term>A subunit </term>
<definition>The portion of a bacterial exotoxin that interferes with normal host cell function. </definition>
<chapters>
<chapter>10</chapter>
</chapters>
</glossaryresource>
</resources>
</manifest>
My dih-config.xml is as follows:
<dataConfig>
<dataSource name="fileReader" type="FileDataSource" encoding="UTF-8"/>
<document>
<entity name="dir" rootEntity="false" dataSource="null" processor="FileListEntityProcessor" fileName="^.*\.xml$" recursive="true" baseDir="X:/tmp/npr">
<entity name="audioresource"
rootEntity="true"
dataSource="fileReader"
url="${dir.fileAbsolutePath}"
stream="false"
logTemplate=" processing ${dir.fileAbsolutePath}"
logLevel="debug"
processor="XPathEntityProcessor"
forEach="/manifest/metadata | /manifest/metadata/authors | /manifest/metadata/categories | /manifest/metadata/resources | /manifest/resources/audioresource | /manifest/resources/audioresource/chapters"
transformer="DateFormatTransformer">
<field column="category" xpath="/manifest/metadata/categories/category" />
<field column="author" xpath="/manifest/metadata/authors/author" />
<field column="book_title" xpath="/manifest/metadata/title" />
<field column="isbn" xpath="/manifest/metadata/isbn"/>
<field column="id" xpath="/manifest/resources/audioresource/uuid"/>
<field column="mimetype" xpath="/manifest/resources/audioresource/mimetype" />
<field column="title" xpath="/manifest/resources/audioresource/title"/>
<field column="description" xpath="/manifest/resources/audioresource/description"/>
<field column="chapter" xpath="/manifest/resources/audioresource/chapters/chapter"/>
<field column="source" xpath="/manifest/resources/audioresource/source"/>
</entity>
</entity>
</document>
</dataConfig>
I'm not very familiar with XPath. I can't use wildcards in element names, can I? I tried it, and it didn't work.
Thanks very much in advance.
Answer 0 (score: 0)
I'm currently working on a similar problem. Have you tried creating an XSLT? The entity element has an optional "xsl" attribute.
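To illustrate the XSLT idea, here is a minimal sketch, not something I have tested against your data. It assumes the only difference between your five file types is the name of the element under /manifest/resources (audioresource, glossaryresource, and so on). The stylesheet renames each of those to a generic `resource` element and keeps the original name in a `type` attribute, so the wildcard lives in the XSLT rather than in the DIH xpath. The element name `resource` and the attribute name `type` are my own choices for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity transform: copy every node and attribute unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any child element of <resources> (audioresource, glossaryresource, ...)
       becomes <resource type="original-name">. The names "resource" and
       "type" are hypothetical, chosen only for this sketch. -->
  <xsl:template match="/manifest/resources/*">
    <resource type="{local-name()}">
      <xsl:apply-templates select="@*|node()"/>
    </resource>
  </xsl:template>
</xsl:stylesheet>
```

With something like this in place, the inner entity would carry an xsl attribute pointing at the stylesheet (the path is whatever you save it as), and the field xpaths would become, for example, /manifest/resources/resource/uuid and /manifest/resources/resource/chapters/chapter. Fields that exist only in some types, such as term and definition in the glossary files, would still need their own field entries.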