Loading XML through SSIS - "DTD is prohibited in this XML document" error

Time: 2010-07-28 06:50:42

Tags: sql-server ssis xsd dtd xml-parsing

Hope you are all doing well.

I previously asked a question, how to import a XML file to SQL Server, and I appreciate the replies.

Because my source file contains a large amount of data, I am trying to load it through SSIS. These are the steps I followed:

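  1. Bulk-loaded the XML into an XML-type column.
  2. Created an XSD schema from the XML file in SQL Server.
  3. In SSIS, used an XML Source and supplied the XML schema for mapping to the OLE DB destination.
  4. But execution fails with:

    "Error: 0xC02090E7 at Load XML, XML Source 1: The component "XML Source" (1) was unable to read the XML data. DTD is prohibited in this XML document."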

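BOL (Books Online) says that SSIS does not support DTDs, and we cannot avoid the DTD in the source file.

Could someone please help me resolve this issue?

Here is my XML file: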

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE _line_feed [
    <!ELEMENT FeedTime (#PCDATA)>
    <!ELEMENT lastContest (#PCDATA)>
    <!ELEMENT lastGame (#PCDATA)>
    <!ELEMENT contest_maximum (#PCDATA)>
    <!ELEMENT contestantnum (#PCDATA)>
    <!ELEMENT description (#PCDATA)>
    <!ELEMENT event (event_datetimeGMT, gamenumber, sporttype, league, contest_maximum?, description?, (participants |  periods | total)+)>
    <!ELEMENT event_datetimeGMT (#PCDATA)>
    <!ELEMENT gamenumber (#PCDATA)>
    <!ELEMENT league (#PCDATA)>
    <!ELEMENT odds (moneyline_value, to_base?)>
    <!ELEMENT over_adjust (#PCDATA)>
    <!ELEMENT participants (participant*)>
    <!ELEMENT participant (participant_name, contestantnum, rotnum, visiting_home_draw?, odds?, pitcher?)>
    <!ELEMENT participant_name (#PCDATA)>
    <!ELEMENT periods (period*)>
    <!ELEMENT period (period_number, period_description, periodcutoff_datetimeGMT, period_status, period_update, spread_maximum?, moneyline_maximum?, total_maximum?, moneyline?, spread?, total?)>
    <!ELEMENT period_number (#PCDATA)>
    <!ELEMENT period_description (#PCDATA)>
    <!ELEMENT period_status (#PCDATA)>
    <!ELEMENT period_update (#PCDATA)>
    <!ELEMENT periodcutoff_datetimeGMT (#PCDATA)>
    <!ELEMENT _line_feed (FeedTime, lastContest, lastGame, events)>
    <!ELEMENT events (event*)>
    <!ELEMENT pitcher (#PCDATA)>
    <!ELEMENT rotnum (#PCDATA)>
    <!ELEMENT sporttype (#PCDATA)>
    <!ELEMENT moneyline (moneyline_visiting, moneyline_home, moneyline_draw?)>
    <!ELEMENT moneyline_value (#PCDATA)>
    <!ELEMENT moneyline_visiting (#PCDATA)>
    <!ELEMENT moneyline_home (#PCDATA)>
    <!ELEMENT moneyline_draw (#PCDATA)>
    <!ELEMENT moneyline_maximum (#PCDATA)>
    <!ELEMENT spread (spread_visiting, spread_adjust_visiting, spread_home, spread_adjust_home)>
    <!ELEMENT spread_adjust_home (#PCDATA)>
    <!ELEMENT spread_adjust_visiting (#PCDATA)>
    <!ELEMENT spread_home (#PCDATA)>
    <!ELEMENT spread_maximum (#PCDATA)>
    <!ELEMENT spread_visiting (#PCDATA)>
    <!ELEMENT to_base (#PCDATA)>
    <!ELEMENT total (total_points, over_adjust?, under_adjust?, units?)>
    <!ELEMENT total_maximum (#PCDATA)>
    <!ELEMENT total_points (#PCDATA)>
    <!ELEMENT under_adjust (#PCDATA)>
    <!ELEMENT units (#PCDATA)>
    <!ELEMENT visiting_home_draw (#PCDATA)>
    ]>
    
    <_line_feed>
        <FeedTime>1279783821193</FeedTime>
        <lastContest>4118567</lastContest>
        <lastGame>58681915</lastGame>
    <events>
    <event>
        <event_datetimeGMT>2010-07-22 20:05</event_datetimeGMT>
        <gamenumber>174201668</gamenumber>
        <sporttype>Tennis</sporttype>
        <league>M Atlanta 16</league>
        <participants>
            <participant>
                <participant_name>A. Roddick</participant_name>
                <contestantnum>4333</contestantnum>
                <rotnum>4333</rotnum>
                <visiting_home_draw>Visiting</visiting_home_draw>
            </participant>
            <participant>
                <participant_name>R. Ram</participant_name>
                <contestantnum>4334</contestantnum>
                <rotnum>4334</rotnum>
                <visiting_home_draw>Home</visiting_home_draw>
            </participant>
        </participants>
        <periods>
            <period>
                <period_number>0</period_number>
                <period_description>Game</period_description>
                <periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
                <period_status>I</period_status>
                <period_update>open</period_update>
                <spread_maximum>500</spread_maximum>
                <moneyline_maximum>1500</moneyline_maximum>
                <total_maximum>500</total_maximum>
                <moneyline>
                    <moneyline_visiting>-1850</moneyline_visiting>
                    <moneyline_home>1290</moneyline_home>
                </moneyline>
            </period>
            <period>
                <period_number>0</period_number>
                <period_description>Game</period_description>
                <periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
                <period_status>O</period_status>
                <period_update>open</period_update>
                <spread_maximum>500</spread_maximum>
                <moneyline_maximum>2000</moneyline_maximum>
                <total_maximum>500</total_maximum>
                <spread>
                    <spread_visiting>-5.5</spread_visiting>
                    <spread_adjust_visiting>-124</spread_adjust_visiting>
                    <spread_home>5.5</spread_home>
                    <spread_adjust_home>106</spread_adjust_home>
                </spread>
                <total>
                    <total_points>19.5</total_points>
                    <over_adjust>113</over_adjust>
                    <under_adjust>-132</under_adjust>
                </total>
            </period>
            <period>
                <period_number>1</period_number>
                <period_description>1st Set</period_description>
                <periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
                <period_status>I</period_status>
                <period_update>open</period_update>
                <spread_maximum>5000</spread_maximum>
                <moneyline_maximum>250</moneyline_maximum>
                <total_maximum>5000</total_maximum>
                <moneyline>
                    <moneyline_visiting>-675</moneyline_visiting>
                    <moneyline_home>497</moneyline_home>
                </moneyline>
            </period>
        </periods>
    </event>
    <event>
        <event_datetimeGMT>2010-07-22 20:05</event_datetimeGMT>
        <gamenumber>174263209</gamenumber>
        <sporttype>Tennis</sporttype>
        <league>M Atlanta 16</league>
        <participants>
            <participant>
                <participant_name>I. Marchenko</participant_name>
                <contestantnum>4335</contestantnum>
                <rotnum>4335</rotnum>
                <visiting_home_draw>Visiting</visiting_home_draw>
            </participant>
            <participant>
                <participant_name>X. Malisse</participant_name>
                <contestantnum>4336</contestantnum>
                <rotnum>4336</rotnum>
                <visiting_home_draw>Home</visiting_home_draw>
            </participant>
        </participants>
        <periods>
            <period>
                <period_number>0</period_number>
                <period_description>Game</period_description>
                <periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
                <period_status>O</period_status>
                <period_update>open</period_update>
                <spread_maximum>500</spread_maximum>
                <moneyline_maximum>2000</moneyline_maximum>
                <total_maximum>500</total_maximum>
                <moneyline>
                    <moneyline_visiting>139</moneyline_visiting>
                    <moneyline_home>-151</moneyline_home>
                </moneyline>
            </period>
            <period>
                <period_number>0</period_number>
                <period_description>Game</period_description>
                <periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
                <period_status>O</period_status>
                <period_update>open</period_update>
                <spread_maximum>500</spread_maximum>
                <moneyline_maximum>2000</moneyline_maximum>
                <total_maximum>500</total_maximum>
                <spread>
                    <spread_visiting>2</spread_visiting>
                    <spread_adjust_visiting>100</spread_adjust_visiting>
                    <spread_home>-2</spread_home>
                    <spread_adjust_home>-117</spread_adjust_home>
                </spread>
                <total>
                    <total_points>22.5</total_points>
                    <over_adjust>-108</over_adjust>
                    <under_adjust>-108</under_adjust>
                </total>
            </period>
            <period>
                <period_number>1</period_number>
                <period_description>1st Set</period_description>
                <periodcutoff_datetimeGMT>2010-07-22 20:05</periodcutoff_datetimeGMT>
                <period_status>O</period_status>
                <period_update>open</period_update>
                <spread_maximum>5000</spread_maximum>
                <moneyline_maximum>500</moneyline_maximum>
                <total_maximum>5000</total_maximum>
                <moneyline>
                    <moneyline_visiting>115</moneyline_visiting>
                    <moneyline_home>-134</moneyline_home>
                </moneyline>
            </period>
        </periods>
    </event>
    <event>
        <event_datetimeGMT>2010-07-22 21:30</event_datetimeGMT>
        <gamenumber>174271178</gamenumber>
        <sporttype>Tennis</sporttype>
        <league>M Atlanta 16</league>
        <participants>
            <participant>
                <participant_name>K. Andersoõn</participant_name>
                <contestantnum>4341</contestantnum>
                <rotnum>4341</rotnum>
                <visiting_home_draw>Visiting</visiting_home_draw>
            </participant>
            <participant>
                <participant_name>D. Young</participant_name>
                <contestantnum>4342</contestantnum>
                <rotnum>4342</rotnum>
                <visiting_home_draw>Home</visiting_home_draw>
            </participant>
        </participants>
        <periods>
            <period>
                <period_number>0</period_number>
                <period_description>Game</period_description>
                <periodcutoff_datetimeGMT>2010-07-22 21:30</periodcutoff_datetimeGMT>
                <period_status>O</period_status>
                <period_update>open</period_update>
                <spread_maximum>500</spread_maximum>
                <moneyline_maximum>2000</moneyline_maximum>
                <total_maximum>500</total_maximum>
                <moneyline>
                    <moneyline_visiting>-148</moneyline_visiting>
                    <moneyline_home>136</moneyline_home>
                </moneyline>
            </period>
            <period>
                <period_number>0</period_number>
                <period_description>Game</period_description>
                <periodcutoff_datetimeGMT>2010-07-22 21:30</periodcutoff_datetimeGMT>
                <period_status>O</period_status>
                <period_update>open</period_update>
                <spread_maximum>500</spread_maximum>
                <moneyline_maximum>2000</moneyline_maximum>
                <total_maximum>500</total_maximum>
                <spread>
                    <spread_visiting>-2</spread_visiting>
                    <spread_adjust_visiting>-121</spread_adjust_visiting>
                    <spread_home>2</spread_home>
                    <spread_adjust_home>104</spread_adjust_home>
                </spread>
                <total>
                    <total_points>22.5</total_points>
                    <over_adjust>-111</over_adjust>
                    <under_adjust>-105</under_adjust>
                </total>
            </period>
            <period>
                <period_number>1</period_number>
                <period_description>1st Set</period_description>
                <periodcutoff_datetimeGMT>2010-07-22 21:30</periodcutoff_datetimeGMT>
                <period_status>O</period_status>
                <period_update>open</period_update>
                <spread_maximum>5000</spread_maximum>
                <moneyline_maximum>500</moneyline_maximum>
                <total_maximum>5000</total_maximum>
                <moneyline>
                    <moneyline_visiting>-126</moneyline_visiting>
                    <moneyline_home>108</moneyline_home>
                </moneyline>
            </period>
        </periods>
    </event>
    
    ...
    ...
    
    </events>
    </_line_feed>
    

The XSD schema file used:

    <?xml version="1.0" encoding="utf-8"?>
    <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="_line_feed">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="FeedTime" type="xs:unsignedLong" />
            <xs:element name="lastContest" type="xs:unsignedInt" />
            <xs:element name="lastGame" type="xs:unsignedInt" />
            <xs:element name="events">
              <xs:complexType>
                <xs:sequence>
                  <xs:element maxOccurs="unbounded" name="event">
                    <xs:complexType>
                      <xs:sequence>
                        <xs:element name="event_datetimeGMT" type="xs:string" />
                        <xs:element name="gamenumber" type="xs:unsignedInt" />
                        <xs:element name="sporttype" type="xs:string" />
                        <xs:element name="league" type="xs:string" />
                        <xs:element minOccurs="0" name="contest_maximum" type="xs:unsignedShort" />
                        <xs:element minOccurs="0" name="description" type="xs:string" />
                        <xs:element name="participants">
                          <xs:complexType>
                            <xs:sequence>
                              <xs:element maxOccurs="unbounded" name="participant">
                                <xs:complexType>
                                  <xs:sequence>
                                    <xs:element name="participant_name" type="xs:string" />
                                    <xs:element name="contestantnum" type="xs:unsignedInt" />
                                    <xs:element name="rotnum" type="xs:unsignedShort" />
                                    <xs:element minOccurs="0" name="odds">
                                      <xs:complexType>
                                        <xs:sequence>
                                          <xs:element name="moneyline_value" type="xs:string" />
                                          <xs:element name="to_base" type="xs:string" />
                                        </xs:sequence>
                                      </xs:complexType>
                                    </xs:element>
                                    <xs:element minOccurs="0" name="visiting_home_draw" type="xs:string" />
                                    <xs:element minOccurs="0" name="pitcher" type="xs:string" />
                                  </xs:sequence>
                                </xs:complexType>
                              </xs:element>
                            </xs:sequence>
                          </xs:complexType>
                        </xs:element>
                        <xs:element minOccurs="0" name="total">
                          <xs:complexType>
                            <xs:sequence>
                              <xs:element name="total_points" type="xs:decimal" />
                              <xs:element name="units" type="xs:string" />
                            </xs:sequence>
                          </xs:complexType>
                        </xs:element>
                        <xs:element minOccurs="0" name="periods">
                          <xs:complexType>
                            <xs:sequence minOccurs="0">
                              <xs:element maxOccurs="unbounded" name="period">
                                <xs:complexType>
                                  <xs:sequence>
                                    <xs:element name="period_number" type="xs:unsignedByte" />
                                    <xs:element name="period_description" type="xs:string" />
                                    <xs:element name="periodcutoff_datetimeGMT" type="xs:string" />
                                    <xs:element name="period_status" type="xs:string" />
                                    <xs:element name="period_update" type="xs:string" />
                                    <xs:element name="spread_maximum" type="xs:unsignedShort" />
                                    <xs:element name="moneyline_maximum" type="xs:unsignedShort" />
                                    <xs:element name="total_maximum" type="xs:unsignedShort" />
                                    <xs:element minOccurs="0" name="moneyline">
                                      <xs:complexType>
                                        <xs:sequence>
                                          <xs:element name="moneyline_visiting" type="xs:short" />
                                          <xs:element name="moneyline_home" type="xs:short" />
                                          <xs:element minOccurs="0" name="moneyline_draw" type="xs:unsignedShort" />
                                        </xs:sequence>
                                      </xs:complexType>
                                    </xs:element>
                                    <xs:element minOccurs="0" name="spread">
                                      <xs:complexType>
                                        <xs:sequence>
                                          <xs:element name="spread_visiting" type="xs:decimal" />
                                          <xs:element name="spread_adjust_visiting" type="xs:short" />
                                          <xs:element name="spread_home" type="xs:decimal" />
                                          <xs:element name="spread_adjust_home" type="xs:short" />
                                        </xs:sequence>
                                      </xs:complexType>
                                    </xs:element>
                                    <xs:element minOccurs="0" name="total">
                                      <xs:complexType>
                                        <xs:sequence>
                                          <xs:element name="total_points" type="xs:decimal" />
                                          <xs:element name="over_adjust" type="xs:short" />
                                          <xs:element name="under_adjust" type="xs:short" />
                                        </xs:sequence>
                                      </xs:complexType>
                                    </xs:element>
                                  </xs:sequence>
                                </xs:complexType>
                              </xs:element>
                            </xs:sequence>
                          </xs:complexType>
                        </xs:element>
                      </xs:sequence>
                    </xs:complexType>
                  </xs:element>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>
    

Waiting for your response. Please let me know if you need any other details to understand the scenario.

Thanks, PRASHANT

2 answers:

Answer 0: (score: 0)

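Have you tried loading the raw XML into an XML data type column in a database table and then converting the data into database tables via stored procedures? I usually prefer this approach because it lets me keep the original XML in case the schema changes in ways I was not aware of at import time, for example when the source system adds new nodes to the file. If you take this approach, you can use sp_xml_preparedocument and OPENXML to convert the data. Note that OPENXML supports DTDs for inferring the output data types.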
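
To illustrate what "converting the XML into tables" means here, the following is a minimal Python sketch of the same shredding idea. It is not the sp_xml_preparedocument/OPENXML route itself, only an illustration outside SQL Server. The function name `participant_rows` and the choice of columns are assumptions made for this example; the element names come from the XML file in the question. The standard-library parser accepts the internal DTD subset, so the DOCTYPE does not have to be removed for this step.

```python
import xml.etree.ElementTree as ET


def participant_rows(xml_bytes):
    """Yield one flat tuple per <participant>, keyed by the parent event's gamenumber."""
    root = ET.fromstring(xml_bytes)
    for event in root.iter("event"):
        game = event.findtext("gamenumber")
        for p in event.iter("participant"):
            yield (
                game,
                p.findtext("participant_name"),
                p.findtext("contestantnum"),
                p.findtext("rotnum"),
                # Optional in the DTD, so this may be None.
                p.findtext("visiting_home_draw"),
            )


# A cut-down version of the feed from the question, DOCTYPE included.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE _line_feed [
<!ELEMENT FeedTime (#PCDATA)>
]>
<_line_feed>
  <events>
    <event>
      <gamenumber>174201668</gamenumber>
      <participants>
        <participant>
          <participant_name>A. Roddick</participant_name>
          <contestantnum>4333</contestantnum>
          <rotnum>4333</rotnum>
          <visiting_home_draw>Visiting</visiting_home_draw>
        </participant>
      </participants>
    </event>
  </events>
</_line_feed>"""

rows = list(participant_rows(sample))
```

Each tuple corresponds to one row of a hypothetical participants table; the periods and their nested moneyline/spread/total elements could be flattened the same way into their own row sets.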

Answer 1: (score: 0)

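In SSRS, this error can occur when there is not enough memory. I don't know whether that also applies to SSIS. (But you explicitly mentioned a large file, so it may be related. You could try a smaller file.)

This link says you have to work around it yourself. Maybe do some preprocessing on the XML file?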
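
One possible form of that preprocessing is to remove the DOCTYPE declaration before handing the file to the XML Source. The sketch below is an assumption about how this could be done, not something taken from the linked page. The helper name `strip_doctype` is made up for this example, and the regular expression assumes the internal subset contains no `]` inside a quoted literal or comment, which holds for the DTD shown in the question.

```python
import re

# Matches a DOCTYPE declaration, with or without an internal subset in [...].
# Assumption: no "]" appears inside quoted literals or comments in the subset.
_DOCTYPE = re.compile(r"<!DOCTYPE[^\[>]*(?:\[.*?\])?\s*>", re.DOTALL)


def strip_doctype(xml_text: str) -> str:
    """Return xml_text with its first DOCTYPE declaration removed."""
    return _DOCTYPE.sub("", xml_text, count=1)


# A cut-down version of the feed from the question.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<!DOCTYPE _line_feed [\n"
    "<!ELEMENT FeedTime (#PCDATA)>\n"
    "<!ELEMENT lastContest (#PCDATA)>\n"
    "]>\n"
    "<_line_feed>\n"
    "    <FeedTime>1279783821193</FeedTime>\n"
    "    <lastContest>4118567</lastContest>\n"
    "</_line_feed>\n"
)

cleaned = strip_doctype(sample)
```

The cleaned text would then be written to a new file and the XML Source pointed at that copy. This version reads the whole document into memory; for a very large feed, a variant that rewrites only the start of the file and streams the remainder would be a better fit.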