Creating and populating complex XML with a Perl script

Asked: 2012-11-27 14:17:55

Tags: xml perl

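I have the following XML file template, which I want to create and populate with a Perl script. All the values of the XML attributes come from different queries against a SQL database. My XML contains a few collection-type attributes.

I am having a hard time deciding which Perl module to use, since there are so many alternatives on CPAN. I would also like to know how to approach this problem.

Any help is much appreciated.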

<TumorDetails>
    <personUpi>String</personUpi>
    <ageAtDiagnosis>3.14159E0</ageAtDiagnosis>
    <biopsyPathologyReportSummary>String</biopsyPathologyReportSummary>
    <primarySiteCollection>
        <tissueSite>
            <description>String</description>
            <name>String</name>
        </tissueSite>
    </primarySiteCollection>
    <distantMetastasisSite>
        <description>String</description>
        <name>String</name>
    </distantMetastasisSite>
    <siteGroup>
        <description>String</description>
        <name>String</name>
    </siteGroup>
    <tmStaging>
        <clinicalDescriptor>String</clinicalDescriptor>
        <clinicalMStage>String</clinicalMStage>
        <siteGroupEdition5>
            <description>String</description>
            <name>String</name>
        </siteGroupEdition5>
        <siteGroupEdition6>
            <description>String</description>
            <name>String</name>
        </siteGroupEdition6>
    </tmStaging>
    <pediatricStaging>
        <doneBy>String</doneBy>
        <group>String</group>
    </pediatricStaging>
    <histologicTypeCollection>
        <histologicType>
            <description>String</description>
            <system>String</system>
            <value>String</value>
        </histologicType>
    </histologicTypeCollection>
    <histologicGradeCollection>
        <histologicGrade>
            <gradeOrDifferentiation>String</gradeOrDifferentiation>
        </histologicGrade>
    </histologicGradeCollection>
    <familyHistoryCollection>
        <familyHistory>
            <otherCancerDiagnosed>String</otherCancerDiagnosed>
            <sameCancerDiagnosed>String</sameCancerDiagnosed>
        </familyHistory>
    </familyHistoryCollection>
    <comorbidityOrComplicationCollection>
        <comorbidityOrComplication>
            <value>String</value>
        </comorbidityOrComplication>
    </comorbidityOrComplicationCollection>
    <tumorBiomarkerTest>
        <her2NeuDerived>String</her2NeuDerived>
        <her2NeuFish>String</her2NeuFish>
    </tumorBiomarkerTest>
    <patientHistoryCollection>
        <patientHistory>
            <cancerSite>String</cancerSite>
            <sequence>2147483647</sequence>
        </patientHistory>
    </patientHistoryCollection>
    <tumorHistory>
        <cancerStatus>String</cancerStatus>
        <cancerStatusFollowUpDate>1967-08-13</cancerStatusFollowUpDate>
        <cancerStatusFollowUpType>String</cancerStatusFollowUpType>
        <qualityOfSurvival>String</qualityOfSurvival>
    </tumorHistory>
    <placeOfDiagnosis>
        <initials>String</initials>
    </placeOfDiagnosis>
    <followUp>
        <dateFollowUpChanged>String</dateFollowUpChanged>
        <dateOfLastCancerStatus>1967-08-13</dateOfLastCancerStatus>
        <nextFollowUpHospital>
            <initials>String</initials>
        </nextFollowUpHospital>
        <lastFollowUpHospital>
            <initials>String</initials>
        </lastFollowUpHospital>
        <tumorFollowUpBiomarkerTest>
            <her2NeuDerived>String</her2NeuDerived>
            <her2NeuFish>String</her2NeuFish>
        </tumorFollowUpBiomarkerTest>
    </followUp>
</TumorDetails>

4 answers:

Answer 0 (score: 3)

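First, a very, very important concept: there is no such thing as an "XML template"! The whole point of using XML is to be able to read and write data according to some schema. If you have a (consistent) XML sample but no schema definition (XSD), use trang to derive one: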

java -jar trang.jar sample.xml sample.xsd

For the sample provided, the generated XSD file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="TumorDetails">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="personUpi"/>
        <xs:element ref="ageAtDiagnosis"/>
        <xs:element ref="biopsyPathologyReportSummary"/>
        <xs:element ref="primarySiteCollection"/>
        <xs:element ref="distantMetastasisSite"/>
        <xs:element ref="siteGroup"/>
        <xs:element ref="tmStaging"/>
        <xs:element ref="pediatricStaging"/>
        <xs:element ref="histologicTypeCollection"/>
        <xs:element ref="histologicGradeCollection"/>
        <xs:element ref="familyHistoryCollection"/>
        <xs:element ref="comorbidityOrComplicationCollection"/>
        <xs:element ref="tumorBiomarkerTest"/>
        <xs:element ref="patientHistoryCollection"/>
        <xs:element ref="tumorHistory"/>
        <xs:element ref="placeOfDiagnosis"/>
        <xs:element ref="followUp"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
...
</xs:schema>

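Now, the best part is called XML::Compile. It takes your XSD schema, compiles it, and fits and validates a native Perl structure against it, producing XML as output: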

#!/usr/bin/env perl
use strict;
use warnings;
use XML::Compile::Schema;
use XML::LibXML;    # for XML::LibXML::Document, used below

my $node = {
    personUpi                    => 'String',
    ageAtDiagnosis               => '3.14159E0',
    biopsyPathologyReportSummary => 'String',
    primarySiteCollection        => {
        tissueSite => {
            description => 'String',
            name        => 'String',
        },
    },
    # ... remaining elements omitted ...
};

my $schema = XML::Compile::Schema->new('sample.xsd');
my $writer = $schema->compile(WRITER => 'TumorDetails');
my $doc = XML::LibXML::Document->new(q(1.0), q(UTF-8));

print $writer->($doc, $node)->toString;
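
The sample contains several "...Collection" elements, which suggests that their children may repeat. A schema inferred by trang from a single sample cannot know that, so the sketch below assumes a hand-edited fragment of the XSD in which tissueSite carries maxOccurs="unbounded". The inline schema and the site names are illustrative and not taken from the question. Under that assumption, XML::Compile expects the repeating element as an array reference:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Compile::Schema;
use XML::LibXML;

# Hypothetical schema fragment: tissueSite is allowed to repeat.
my $xsd = <<'END_XSD';
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="primarySiteCollection">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="tissueSite" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="description" type="xs:string"/>
              <xs:element name="name" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
END_XSD

# The schema is passed as an XML string instead of a file name.
my $schema = XML::Compile::Schema->new($xsd);
my $writer = $schema->compile(WRITER => 'primarySiteCollection');
my $doc    = XML::LibXML::Document->new('1.0', 'UTF-8');

# A repeating element is given as an array of hashes (made-up values).
my $data = {
    tissueSite => [
        { description => 'Primary site',   name => 'Lung'  },
        { description => 'Secondary site', name => 'Liver' },
    ],
};

my $xml = $writer->($doc, $data)->toString(1);
print $xml, "\n";
```

Each database row would become one hash in the tissueSite array, and the compiled writer rejects structures that do not fit the schema.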

Answer 1 (score: 2)

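A lot depends on what you are already familiar with. If you are used to navigating XML documents using the Document Object Model, then XML::DOM, XML::LibXML, and XML::Twig are good choices. XML::TreeBuilder is a similar module with its own API, and you will only discover whether it suits you by trying it.

However, all of these modules are designed primarily for navigating and accessing existing XML data, and they are only partly useful for creating new XML from scratch. By contrast, the modules XML::Generator, XML::Writer, and XML::API are designed specifically for this purpose, and all have similar interfaces. My preference, and my recommendation to you, is XML::API, which has the most flexible interface and should suit your purpose well.

With XML::API, the code that generates a given XML document corresponds one-to-one with the resulting XML. Each statement corresponds to an XML element or tag, and tag names, attribute names, and text values can be derived at run time, for example from information in a database.

This program recreates the sample XML. Note that subsections can be coded separately and split out into subroutines, passing the XML::API object to each one. It is also possible to generate the XML in a non-linear fashion, since each method returns a reference to the element it created, and there is a _goto method that takes such a reference and sets the position for subsequent additions. In fact the _close method, rather than writing any data, simply does a _goto to the parent of the current element.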

use strict;
use warnings;

use XML::API;

my $xml = XML::API->new(doctype => 'xhtml');

$xml->_open('TumorDetails');

  $xml->_element('personUpi', 'String');
  $xml->_element('ageAtDiagnosis', '3.14159E0');
  $xml->_element('biopsyPathologyReportSummary', 'String');

  $xml->_open('primarySiteCollection');
    $xml->_open('tissueSite');
      $xml->_element('description', 'String');
      $xml->_element('name', 'String');
    $xml->_close('tissueSite');
  $xml->_close('primarySiteCollection');

  $xml->_open('distantMetastasisSite');
    $xml->_element('description', 'String');
    $xml->_element('name', 'String');
  $xml->_close('distantMetastasisSite');

  $xml->_open('siteGroup');
    $xml->_element('description', 'String');
    $xml->_element('name', 'String');
  $xml->_close('siteGroup');

  $xml->_open('tmStaging');
    $xml->_element('clinicalDescriptor', 'String');
    $xml->_element('clinicalMStage', 'String');
    $xml->_open('siteGroupEdition5');
      $xml->_element('description', 'String');
      $xml->_element('name', 'String');
    $xml->_close('siteGroupEdition5');
    $xml->_open('siteGroupEdition6');
      $xml->_element('description', 'String');
      $xml->_element('name', 'String');
    $xml->_close('siteGroupEdition6');
  $xml->_close('tmStaging');

  $xml->_open('pediatricStaging');
    $xml->_element('doneBy', 'String');
    $xml->_element('group', 'String');
  $xml->_close('pediatricStaging');

  $xml->_open('histologicTypeCollection');
    $xml->_open('histologicType');
      $xml->_element('description', 'String');
      $xml->_element('system', 'String');
      $xml->_element('value', 'String');
    $xml->_close('histologicType');
  $xml->_close('histologicTypeCollection');

  $xml->_open('histologicGradeCollection');
    $xml->_open('histologicGrade');
      $xml->_element('gradeOrDifferentiation', 'String');
    $xml->_close('histologicGrade');
  $xml->_close('histologicGradeCollection');

  $xml->_open('familyHistoryCollection');
    $xml->_open('familyHistory');
      $xml->_element('otherCancerDiagnosed', 'String');
      $xml->_element('sameCancerDiagnosed', 'String');
    $xml->_close('familyHistory');
  $xml->_close('familyHistoryCollection');

  $xml->_open('comorbidityOrComplicationCollection');
    $xml->_open('comorbidityOrComplication');
      $xml->_element('value', 'String');
    $xml->_close('comorbidityOrComplication');
  $xml->_close('comorbidityOrComplicationCollection');

  $xml->_open('tumorBiomarkerTest');
    $xml->_element('her2NeuDerived', 'String');
    $xml->_element('her2NeuFish', 'String');
  $xml->_close('tumorBiomarkerTest');

  $xml->_open('patientHistoryCollection');
    $xml->_open('patientHistory');
      $xml->_element('cancerSite', 'String');
      $xml->_element('sequence', '2147483647');
    $xml->_close('patientHistory');
  $xml->_close('patientHistoryCollection');

  $xml->_open('tumorHistory');
    $xml->_element('cancerStatus', 'String');
    $xml->_element('cancerStatusFollowUpDate', '1967-08-13');
    $xml->_element('cancerStatusFollowUpType', 'String');
    $xml->_element('qualityOfSurvival', 'String');
  $xml->_close('tumorHistory');

  $xml->_open('placeOfDiagnosis');
    $xml->_element('initials', 'String');
  $xml->_close('placeOfDiagnosis');

  $xml->_open('followUp');
    $xml->_element('dateFollowUpChanged', 'String');
    $xml->_element('dateOfLastCancerStatus', '1967-08-13');
    $xml->_open('nextFollowUpHospital');
      $xml->_element('initials', 'String');
    $xml->_close('nextFollowUpHospital');
    $xml->_open('lastFollowUpHospital');
      $xml->_element('initials', 'String');
    $xml->_close('lastFollowUpHospital');
    $xml->_open('tumorFollowUpBiomarkerTest');
      $xml->_element('her2NeuDerived', 'String');
      $xml->_element('her2NeuFish', 'String');
    $xml->_close('tumorFollowUpBiomarkerTest');
  $xml->_close('followUp');

$xml->_close('TumorDetails');

print $xml;

Answer 2 (score: 1)

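If the data always has the same shape, ddoxey's Template Toolkit solution is a good one. If some tags are sometimes absent, however, you need to build the XML from scratch each time.

I did some work with XML recently and was very happy with XML::Writer.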
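
As a minimal sketch of that approach, the following builds a small part of the document with XML::Writer and skips any element whose value is undefined. The %tumor hash is a stand-in for rows fetched from the database, its values are made up, and optional_element is a hypothetical helper, not part of XML::Writer:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Writer;

# Stand-in for query results; the values are invented for illustration.
my %tumor = (
    personUpi                    => 'P-0001',
    ageAtDiagnosis               => '42',
    biopsyPathologyReportSummary => undef,    # no value: tag is skipped
    tissueSites                  => [
        { name => 'Lung',  description => 'Primary site' },
        { name => 'Liver', description => undef },
    ],
);

my $writer = XML::Writer->new(
    OUTPUT      => 'self',    # collect the output inside the writer
    DATA_MODE   => 1,         # put each element on its own line
    DATA_INDENT => 4,
);

# Hypothetical helper: emit <name>value</name> only for defined values.
sub optional_element {
    my ($w, $name, $value) = @_;
    $w->dataElement($name, $value) if defined $value;
    return;
}

$writer->startTag('TumorDetails');
optional_element($writer, $_, $tumor{$_})
    for qw(personUpi ageAtDiagnosis biopsyPathologyReportSummary);

# One <tissueSite> per row in the collection.
$writer->startTag('primarySiteCollection');
for my $site (@{ $tumor{tissueSites} }) {
    $writer->startTag('tissueSite');
    optional_element($writer, $_, $site->{$_}) for qw(description name);
    $writer->endTag('tissueSite');
}
$writer->endTag('primarySiteCollection');

$writer->endTag('TumorDetails');
$writer->end;

my $xml = $writer->to_string;
print $xml;
```

XML::Writer escapes text content itself and complains about mismatched start and end tags, which is the main advantage over concatenating strings by hand.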

Answer 3 (score: 0)

I'm a bit partial to the Template Toolkit. See:

#!/usr/bin/perl -Tw

use strict;
use warnings;
use Template;

my $tmpl = get_template();
my $rec  = get_record();
my $xml;

my $template = Template->new();

$template->process( \$tmpl, $rec, \$xml )
    || die $template->error();

print "$xml";

# ...

sub get_record {

    return {
        personUpi                    => 'String',
        ageAtDiagnosis               => '3.14159E0',
        biopsyPathologyReportSummary => 'String',
        primarySiteCollection        => {
            tissueSite => {
                description => 'String',
                name        => 'String',
            },
        },
        distantMetastasisSite => {
            description => 'String',
            name        => 'String',
        },
        siteGroup => {
            description => 'String',
            name        => 'String',
        },
        tmStaging => {
            clinicalDescriptor => 'String',
            clinicalMStage     => 'String',
            siteGroupEdition5  => {
                description => 'String',
                name        => 'String',
            },
            siteGroupEdition6 => {
                description => 'String',
                name        => 'String',
            },
        },
        pediatricStaging => {
            doneBy => 'String',
            group  => 'String',
        },
        histologicTypeCollection => {
            histologicType => {
                description => 'String',
                system      => 'String',
                value       => 'String',
            },
        },
        histologicGradeCollection => {
            histologicGrade => { gradeOrDifferentiation => 'String', }, },
        familyHistoryCollection => {
            familyHistory => {
                otherCancerDiagnosed => 'String',
                sameCancerDiagnosed  => 'String',
            },
        },
        comorbidityOrComplicationCollection => {
            comorbidityOrComplicationCollection => { value => 'String', },
        },
        tumorBiomarkerTest => {
            her2NeuDerived => 'String',
            her2NeuFish    => 'String',
        },
        patientHistoryCollection => {
            patientHistory => {
                cancerSite => 'String',
                sequence   => '2147483647',
            },
        },
        tumorHistory => {
            cancerStatus             => 'String',
            cancerStatusFollowUpDate => '1967-08-13',
            cancerStatusFollowUpType => 'String',
            qualityOfSurvival        => 'String',
        },
        placeOfDiagnosis => { initials => 'String', },
        followUp         => {
            dateFollowUpChanged        => 'String',
            dateOfLastCancerStatus     => '1967-08-13',
            nextFollowUpHospital       => { initials => 'String', },
            lastFollowUpHospital       => { initials => 'String', },
            tumorFollowUpBiomarkerTest => {
                her2NeuDerived => 'String',
                her2NeuFish    => 'String',
            },
        },
    };
}

sub get_template {

    return <<'END_TEMPL';
<TumorDetails>
    <personUpi>[% personUpi %]</personUpi>
    <ageAtDiagnosis>[% ageAtDiagnosis %]</ageAtDiagnosis>
    <biopsyPathologyReportSummary>[% biopsyPathologyReportSummary %]</biopsyPathologyReportSummary>
    <primarySiteCollection>
        <tissueSite>
            <description>[% primarySiteCollection.tissueSite.description %]</description>
            <name>[% primarySiteCollection.tissueSite.name %]</name>
        </tissueSite>
    </primarySiteCollection>
    <distantMetastasisSite>
        <description>[% distantMetastasisSite.description %]</description>
        <name>[% distantMetastasisSite.name %]</name>
    </distantMetastasisSite>
    <siteGroup>
        <description>[% siteGroup.description %]</description>
        <name>[% siteGroup.name %]</name>
    </siteGroup>
    <tmStaging>
        <clinicalDescriptor>[% tmStaging.clinicalDescriptor %]</clinicalDescriptor>
        <clinicalMStage>[% tmStaging.clinicalMStage %]</clinicalMStage>
        <siteGroupEdition5>
            <description>[% tmStaging.siteGroupEdition5.description %]</description>
            <name>[% tmStaging.siteGroupEdition5.name %]</name>
        </siteGroupEdition5>
        <siteGroupEdition6>
            <description>[% tmStaging.siteGroupEdition6.description %]</description>
            <name>[% tmStaging.siteGroupEdition6.name %]</name>
        </siteGroupEdition6>
    </tmStaging>
    <pediatricStaging>
        <doneBy>[% pediatricStaging.doneBy %]</doneBy>
        <group>[% pediatricStaging.group %]</group>
    </pediatricStaging>
    <histologicTypeCollection>
        <histologicType>
            <description>[% histologicTypeCollection.histologicType.description %]</description>
            <system>[% histologicTypeCollection.histologicType.system %]</system>
            <value>[% histologicTypeCollection.histologicType.value %]</value>
        </histologicType>
    </histologicTypeCollection>
    <histologicGradeCollection>
        <histologicGrade>
            <gradeOrDifferentiation>[% histologicGradeCollection.histologicGrade.gradeOrDifferentiation %]</gradeOrDifferentiation>
        </histologicGrade>
    </histologicGradeCollection>
    <familyHistoryCollection>
        <familyHistory>
            <otherCancerDiagnosed>[% familyHistoryCollection.familyHistory.otherCancerDiagnosed %]</otherCancerDiagnosed>
            <sameCancerDiagnosed>[% familyHistoryCollection.familyHistory.sameCancerDiagnosed %]</sameCancerDiagnosed>
        </familyHistory>
    </familyHistoryCollection>
    <comorbidityOrComplicationCollection>
        <comorbidityOrComplication>
            <value>[% comorbidityOrComplicationCollection.comorbidityOrComplicationCollection.value %]</value>
        </comorbidityOrComplication>
    </comorbidityOrComplicationCollection>
    <tumorBiomarkerTest>
        <her2NeuDerived>[% tumorBiomarkerTest.her2NeuDerived %]</her2NeuDerived>
        <her2NeuFish>[% tumorBiomarkerTest.her2NeuFish %]</her2NeuFish>
    </tumorBiomarkerTest>
    <patientHistoryCollection>
        <patientHistory>
            <cancerSite>[% patientHistoryCollection.patientHistory.cancerSite %]</cancerSite>
            <sequence>[% patientHistoryCollection.patientHistory.sequence %]</sequence>
        </patientHistory>
    </patientHistoryCollection>
    <tumorHistory>
        <cancerStatus>[% tumorHistory.cancerStatus %]</cancerStatus>
        <cancerStatusFollowUpDate>[% tumorHistory.cancerStatusFollowUpDate %]</cancerStatusFollowUpDate>
        <cancerStatusFollowUpType>[% tumorHistory.cancerStatusFollowUpType %]</cancerStatusFollowUpType>
        <qualityOfSurvival>[% tumorHistory.qualityOfSurvival %]</qualityOfSurvival>
    </tumorHistory>
    <placeOfDiagnosis>
        <initials>[% placeOfDiagnosis.initials %]</initials>
    </placeOfDiagnosis>
    <followUp>
        <dateFollowUpChanged>[% followUp.dateFollowUpChanged %]</dateFollowUpChanged>
        <dateOfLastCancerStatus>[% followUp.dateOfLastCancerStatus %]</dateOfLastCancerStatus>
        <nextFollowUpHospital>
            <initials>[% followUp.nextFollowUpHospital.initials %]</initials>
        </nextFollowUpHospital>
        <lastFollowUpHospital>
            <initials>[% followUp.lastFollowUpHospital.initials %]</initials>
        </lastFollowUpHospital>
        <tumorFollowUpBiomarkerTest>
            <her2NeuDerived>[% followUp.tumorFollowUpBiomarkerTest.her2NeuDerived %]</her2NeuDerived>
            <her2NeuFish>[% followUp.tumorFollowUpBiomarkerTest.her2NeuFish %]</her2NeuFish>
        </tumorFollowUpBiomarkerTest>
    </followUp>
</TumorDetails>
END_TEMPL
}

__END__
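
The template above assumes that every element occurs exactly once and that no value contains characters that are special in XML. A possible variation, sketched here with a made-up record, uses the Template Toolkit FOREACH and IF directives for repeating and optional elements and the xml filter to escape values. The record layout (tissueSite as an array) is an assumption for illustration and differs from get_record above:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Template;

# Made-up record: two tissue sites, one without a description,
# and a value containing characters that must be escaped in XML.
my $rec = {
    personUpi  => 'A & B <1>',
    tissueSite => [
        { name => 'Lung', description => 'Primary site' },
        { name => 'Liver' },
    ],
};

my $tmpl = <<'END_TEMPL';
<TumorDetails>
    <personUpi>[% personUpi | xml %]</personUpi>
    <primarySiteCollection>
[%- FOREACH site IN tissueSite %]
        <tissueSite>
[%- IF site.description %]
            <description>[% site.description | xml %]</description>
[%- END %]
            <name>[% site.name | xml %]</name>
        </tissueSite>
[%- END %]
    </primarySiteCollection>
</TumorDetails>
END_TEMPL

my $template = Template->new();
my $xml;

$template->process( \$tmpl, $rec, \$xml )
    || die $template->error();

print $xml;
```

Note that the IF test treats an empty string or '0' as absent, so a stricter check may be needed for fields where those are legitimate values.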