XML to CSV with XSLT

Time: 2016-02-16 16:08:05

Tags: xml csv xslt omeka

I am trying to convert Omeka XML output to CSV and am having trouble getting the XSLT right. Here is a snippet of the original XML output:

<item itemId="2203" public="1" featured="1">
    <fileContainer>
      <file fileId="3887">
        <src>http://bodies.haa.pitt.edu/files/original/bfbf8515f3b05eca4c506111da66b325.jpg</src>
        <authentication>64676f4fee1b283036fff06961129793</authentication>
      </file>
      <file fileId="3888">
        <src>http://bodies.haa.pitt.edu/files/original/c2690cfd10eeffa32697917df48c9e08.jpg</src>
        <authentication>f21fc08344f0b5e34006e6af2f699bd8</authentication>
      </file>
    </fileContainer>
    <collection collectionId="2">
      <elementSetContainer>
        <elementSet elementSetId="1">
          <name>Dublin Core</name>
          <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
          <elementContainer>
            <element elementId="50">
              <name>Title</name>
              <description>A name given to the resource</description>
              <elementTextContainer>
                <elementText elementTextId="62079">
                  <text>Ohio Penitentiary, State Archives Series 1002AV, "Bertillon cards with photographs [graphic], 1888-1919"</text>
                </elementText>
              </elementTextContainer>
            </element>
            <element elementId="41">
              <name>Description</name>
              <description>An account of the resource</description>
              <elementTextContainer>
                <elementText elementTextId="62080">
                  <text>Earliest series of OP cards. </text>
                </elementText>
              </elementTextContainer>
            </element>
            <element elementId="39">
              <name>Creator</name>
              <description>An entity primarily responsible for making the resource</description>
              <elementTextContainer>
                <elementText elementTextId="62081">
                  <text>Ohio Penitentiary </text>
                </elementText>
              </elementTextContainer>
            </element>
            <element elementId="48">
              <name>Source</name>
              <description>A related resource from which the described resource is derived</description>
              <elementTextContainer>
                <elementText elementTextId="62082">
                  <text>1888-1919</text>
                </elementText>
              </elementTextContainer>
            </element>
            <element elementId="43">
              <name>Identifier</name>
              <description>An unambiguous reference to the resource within a given context</description>
              <elementTextContainer>
                <elementText elementTextId="62083">
                  <text>OP1002AV</text>
                </elementText>
              </elementTextContainer>
            </element>
          </elementContainer>
        </elementSet>
      </elementSetContainer>
    </collection>
    <itemType itemTypeId="24">
      <name>Type 1 OP Bertillon Card Form: 1880s Card</name>
      <description>If you do not have any data to enter into an element field please enter n/a</description>
      <elementContainer>
        <element elementId="101">
          <name>Measured at</name>
          <description>This is a date field. If it says anything other than "Ohio Penitentiary" or an abbreviation of that name, mark this record "irregular" below. Please enter the date in the following format: 1902-03-31 for March 31, 1902. If no month or day is provided, please just enter the year (e.g. 1888).</description>
          <elementTextContainer>
            <elementText elementTextId="53093">
              <text>Ohio Penitentiary</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="104">
          <name>Measured by</name>
          <description>Name of Bertillon Officer. Often W.B. Cherington.</description>
          <elementTextContainer>
            <elementText elementTextId="53094">
              <text>illegible</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="60">
          <name>Height</name>
          <description>Please enter entire height, in the following format 171.1 for 1m 71.1cm</description>
          <elementTextContainer>
            <elementText elementTextId="53095">
              <text>169.5</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="77">
          <name>Height 2</name>
          <description>Please enter entire height written in red, in the following format 171.1 for 1m 71.1cm</description>
          <elementTextContainer>
            <elementText elementTextId="53096">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="61">
          <name>Stoop</name>
          <description>Usually "English Height." Please enter height, in the following format: 5-8 for 5 ft 8 inches. Mark this record "irregular" below if this field contains anything else but the prisoner's height in feet and inches.</description>
          <elementTextContainer>
            <elementText elementTextId="53097">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="78">
          <name>Stoop 2</name>
          <description>Usually "English Height," written in red. Please enter height, in the following format: 5-8 for 5 ft 8 inches. Mark this record "irregular" below if this field contains anything else but the prisoner's height in feet and inches.</description>
          <elementTextContainer>
            <elementText elementTextId="53098">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="62">
          <name>Outs. A.</name>
          <description>"Outside Arm," a.k.a., "Reach." Please enter height, in the following format: 176.0 for 1m 76.0cm.</description>
          <elementTextContainer>
            <elementText elementTextId="53099">
              <text>182</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="79">
          <name>Outs. A. 2</name>
          <description>"Outside Arm," a.k.a., "Reach." Please enter height, in the following format: 176.0 for 1m 76.0cm. Fill in if a second set of data appears in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53100">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="63">
          <name>Trunk</name>
          <description>"Height of a man seated"</description>
          <elementTextContainer>
            <elementText elementTextId="53101">
              <text>86.1</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="80">
          <name>Trunk 2</name>
          <description>"Height of a man seated." Fill in if a second set of measurements is provided in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53102">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="102">
          <name>Head, lgth</name>
          <description>"Head length"</description>
          <elementTextContainer>
            <elementText elementTextId="53103">
              <text>19.1</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="103">
          <name>Head, lgth 2</name>
          <description>"Head length." Fill in if a second set of measurements is provided in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53104">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="65">
          <name>" wdth</name>
          <description>"Head width"</description>
          <elementTextContainer>
            <elementText elementTextId="53105">
              <text>14.6</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="83">
          <name>" wdth 2</name>
          <description>"Head width." Fill in if a second set of measurements is provided in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53106">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="66">
          <name>R. Ear lgth</name>
          <description>"Right ear length"</description>
          <elementTextContainer>
            <elementText elementTextId="53107">
              <text>6.4</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="82">
          <name>R. Ear lgth 2</name>
          <description>"Right ear length." Fill in if a second set of measurements is provided in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53108">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="67">
          <name>R. Ear wdth</name>
          <description>Usually "Right cheek width," which is denoted by the Bertillon Officer writing in "chk" over the abbreviation "wdth." Mark this record "irregular" below if this field contains anything else but an apparent measurement of the right cheek width.</description>
          <elementTextContainer>
            <elementText elementTextId="53109">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="84">
          <name>R. Ear wdth 2</name>
          <description>Usually "Right check width," which is denoted by the Bertillon Officer writing in "chk" over the abbreviation "wdth." Mark this record "irregular below if this field contains anything else but an apparent measurement of the right check width. Fill in if a second set of measurements is provided in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53110">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="68">
          <name>L. Foot</name>
          <description>"Left foot length"</description>
          <elementTextContainer>
            <elementText elementTextId="53111">
              <text>27.8</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="85">
          <name>L. Foot 2</name>
          <description>"Left foot length." Fill in if a second set of measurements is provided in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53112">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="69">
          <name>L. Mid F</name>
          <description>"Left middle finger"</description>
          <elementTextContainer>
            <elementText elementTextId="53113">
              <text>12.4</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="86">
          <name>L. Mid F 2</name>
          <description>"Left middle finger." Fill in if a second set of measurements is provided in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53114">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="70">
          <name>L. Lit. F</name>
          <description>"Left little finger"</description>
          <elementTextContainer>
            <elementText elementTextId="53115">
              <text>9.9</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="87">
          <name>L. Lit. F 2</name>
          <description>"Left little finger." Fill in if a second set of measurements is provided in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53116">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="71">
          <name>L Fore A</name>
          <description>"Left forearm"</description>
          <elementTextContainer>
            <elementText elementTextId="53117">
              <text>48.5</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="88">
          <name>L Fore A 2</name>
          <description>"Left forearm." Fill in if a second set of measurements is provided in red.</description>
          <elementTextContainer>
            <elementText elementTextId="53118">
              <text>n/a</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="59">
          <name>Prisoner Number</name>
          <description>Found on the prisoner in the photograph.</description>
          <elementTextContainer>
            <elementText elementTextId="53119">
              <text>19148</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="74">
          <name>Anomaly Marker</name>
          <description>Options: irregular, damaged</description>
          <elementTextContainer>
            <elementText elementTextId="53120">
              <text>irregular</text>
            </elementText>
          </elementTextContainer>
        </element>
        <element elementId="76">
          <name>Cataloguer's Notes</name>
          <description>Feel free to wax poetic about anything you notice, here.</description>
          <elementTextContainer>
            <elementText elementTextId="53121">
              <text>the name of the measurer is illegible and the right edge of the card is ripped</text>
            </elementText>
          </elementTextContainer>
        </element>
      </elementContainer>
    </itemType>
    <elementSetContainer>
      <elementSet elementSetId="1">
        <name>Dublin Core</name>
        <description>The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/.</description>
        <elementContainer>
          <element elementId="50">
            <name>Title</name>
            <description>A name given to the resource</description>
            <elementTextContainer>
              <elementText elementTextId="52762">
                <text>Prisoner 19148</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="39">
            <name>Creator</name>
            <description>An entity primarily responsible for making the resource</description>
            <elementTextContainer>
              <elementText elementTextId="52763">
                <text>Ohio Penitentiary</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="48">
            <name>Source</name>
            <description>A related resource from which the described resource is derived</description>
            <elementTextContainer>
              <elementText elementTextId="52764">
                <text>Ohio History Connection, Columbus, OH</text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="38">
            <name>Coverage</name>
            <description>The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant</description>
            <elementTextContainer>
              <elementText elementTextId="52765">
                <text>Series 1002AV, Box 4040, Folder 1888: 19146-19192 </text>
              </elementText>
            </elementTextContainer>
          </element>
          <element elementId="43">
            <name>Identifier</name>
            <description>An unambiguous reference to the resource within a given context</description>
            <elementTextContainer>
              <elementText elementTextId="52766">
                <text>1002AV_19148</text>
              </elementText>
            </elementTextContainer>
          </element>
        </elementContainer>
      </elementSet>
    </elementSetContainer>
    <tagContainer>
      <tag tagId="1">
        <name>pass1a</name>
      </tag>
    </tagContainer>
  </item>

I would like the output to be:

fileID,Title,Description,Creator,Source
"[filename]","Ohio Penitentiary State Archives Series 1002AV Bertillon cards with photographs [graphic] 1888-1919","Earliest series of OP cards.","Ohio Penitentiary","1888-1919"

I am currently using the following XSLT:

<?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output method="text" encoding="iso-8859-1"/>
        
        <xsl:strip-space elements="*" />
        
        <xsl:template match="/*/child::*">
            <xsl:for-each select="child::*">
                <xsl:if test="position() != last()">"<xsl:value-of select="normalize-space(.)"/>",    </xsl:if>
                <xsl:if test="position()  = last()">"<xsl:value-of select="normalize-space(.)"/>"<xsl:text>&#xD;</xsl:text>
                </xsl:if>
            </xsl:for-each>
        </xsl:template>
        
    </xsl:stylesheet>

which produces the following output:

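"[filename]","Dublin CoreThe Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, [DC website].TitleA name given to the resourceOhio Penitentiary, State Archives Series 1002AV, "Bertillon cards with photographs [graphic], 1888-1919"DescriptionAn account of the resourceEarliest series of OP cards. CreatorAn entity primarily responsible for making the resourceOhio Penitentiary SourceA related resource from which the described resource is derived1888-1919IdentifierAn unambiguous reference to the resource within a given contextOP1002AV","Type 1 OP Bertillon Card Form: 1880s Card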

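Basically, I am trying to figure out how to modify the XSLT so that I only get the content of the text elements, not the content of the description or name elements. The name values would serve as the column headers.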

Any help is greatly appreciated!

1 answer:

Answer 0: (score: 0)

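Since your XML sample contains only a single item, the handling of each item has to be worked out against the actual data in your XML document. For that single record, you can try the following transformation: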

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" encoding="iso-8859-1" />

<xsl:strip-space elements="*" />

<!-- One comma-separated line per item; values are written unquoted -->
<xsl:template match="item">
    <!-- In XSLT 1.0, value-of outputs only the first matching node (the first file's id) -->
    <xsl:value-of select="fileContainer/file/@fileId" />
    <xsl:text>,</xsl:text>
    <xsl:value-of select="collection/elementSetContainer/elementSet/elementContainer/element[name='Title']/elementTextContainer/elementText/text" />
    <xsl:text>,</xsl:text>
    <xsl:value-of select="collection/elementSetContainer/elementSet/elementContainer/element[name='Description']/elementTextContainer/elementText/text" />
    <xsl:text>,</xsl:text>
    <xsl:value-of select="collection/elementSetContainer/elementSet/elementContainer/element[name='Creator']/elementTextContainer/elementText/text" />
    <xsl:text>,</xsl:text>
    <xsl:value-of select="collection/elementSetContainer/elementSet/elementContainer/element[name='Source']/elementTextContainer/elementText/text" />
    <!-- End the row so that several items produce several lines -->
    <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
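
The title in the sample contains both commas and double quotes, so unquoted values would not parse as CSV. The following is a minimal sketch, not a definitive implementation, of the same per-item selection with a header row, quoted fields, and embedded quotes doubled. It assumes XSLT 1.0, that the elements are in no namespace (as in the snippet; if the real export declares a default namespace, the paths would need a prefix bound to it), and that the id of the first file is an acceptable stand-in for the requested file name. The template names `field` and `escape` and the variable `dc` are made up for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <!-- Write the header row once, then one row per item -->
  <xsl:template match="/">
    <xsl:text>fileID,Title,Description,Creator,Source&#10;</xsl:text>
    <xsl:apply-templates select="//item"/>
  </xsl:template>

  <xsl:template match="item">
    <!-- Dublin Core elements of the item's collection (same source as the answer above) -->
    <xsl:variable name="dc"
        select="collection/elementSetContainer/elementSet[name='Dublin Core']/elementContainer/element"/>
    <!-- Assumption: the first file's id stands in for the file name -->
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="fileContainer/file[1]/@fileId"/>
    </xsl:call-template>
    <xsl:text>,</xsl:text>
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="$dc[name='Title']/elementTextContainer/elementText/text"/>
    </xsl:call-template>
    <xsl:text>,</xsl:text>
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="$dc[name='Description']/elementTextContainer/elementText/text"/>
    </xsl:call-template>
    <xsl:text>,</xsl:text>
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="$dc[name='Creator']/elementTextContainer/elementText/text"/>
    </xsl:call-template>
    <xsl:text>,</xsl:text>
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="$dc[name='Source']/elementTextContainer/elementText/text"/>
    </xsl:call-template>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Wrap a value in double quotes; only the first node of $value is used -->
  <xsl:template name="field">
    <xsl:param name="value"/>
    <xsl:text>"</xsl:text>
    <xsl:call-template name="escape">
      <xsl:with-param name="s" select="normalize-space($value)"/>
    </xsl:call-template>
    <xsl:text>"</xsl:text>
  </xsl:template>

  <!-- Recursively double every embedded double quote -->
  <xsl:template name="escape">
    <xsl:param name="s"/>
    <xsl:choose>
      <xsl:when test="contains($s, '&quot;')">
        <xsl:value-of select="substring-before($s, '&quot;')"/>
        <xsl:text>""</xsl:text>
        <xsl:call-template name="escape">
          <xsl:with-param name="s" select="substring-after($s, '&quot;')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$s"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 1.0 processor such as xsltproc, this could be run as `xsltproc omeka-to-csv.xsl export.xml > out.csv` (both file names are placeholders). For the sample item, the row should begin with "3887" and carry the collection's title with its inner quotes doubled. To take the columns from the item's own Dublin Core set instead (for example "Prisoner 19148" as the title), start the `dc` path at `elementSetContainer` rather than `collection/elementSetContainer`.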