Goal
I am trying to migrate data from a multi-level XML file with nested elements into a single table.
System parameters
XML file
Here's the XSD for the XML file I have
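As you can see, it is not just a simple layout. The whole thing is wrapped in a 'People' tag, and there are about 1000 people. Each 'Person' tag contains the following information elements. The XML looks like this:
Person
As a side note, an element can occur multiple times (for example, several 'Book' elements inside 'Books').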
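Question
Now, here is my question. How can I get all of this information into a single SQL table with SSIS? I know the topology of the XML file does not map directly onto the topology of a table, but I want to force it. I want one row for each 'Person'. I also want enough columns to capture the largest number of 'Book' elements that any one person in my data set has. Perhaps that means creating columns such as 'Book_1', 'Book_2', 'Book_3', and so on in the final table. I do not want a series of tables with foreign and primary keys. I want each 'Book', with its 'Year' and 'Details', to get its own columns. To make this clearer, let me show you an example of what I want.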
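Example XML file
If I have a 'Books' tag with 3 'Book' elements, I want a separate column for each book:
Example of the resulting table in the SQL database
I would like the table to look like this, and the same for all nested elements of the XML file. Is it possible to do this kind of flattened import into the database with SSIS?
Thanks! I really appreciate it.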
Additional notes
Snippet of the actual XML file
Below is a sample of the XML file. The actual XML has many <Person> elements.
<?xml version="1.0" encoding="UTF-8" ?>
<People>
<Person>
<FirstName>Eliza</FirstName>
<LastName>Ablovatski</LastName>
<Biography>
<![CDATA[<p>Eliza Ablovatski joined the Kenyon history department in 2003, after graduate work in East Central European history at Columbia University and research and fellowships in Munich and Berlin, Germany and Budapest, Hungary. She teaches classes on Europe from 1500 to the present, focusing on the nineteenth and twentieth centuries, Germany, Russia, the Habsburg Monarchy, film, nationalism and identity, gender, race, and the interwar period.</p>
<p>Her dissertation and first book, <em>Revolution and Political Violence in Central Europe: The Deluge of 1919</em> (forthcoming from Cambridge University Press), focus on the revolutionary upheavals in Munich and Budapest following the First World War, and their relationship to political violence and antisemitism. She is currently researching the occupation of Austria (1945-1955) at the end of the Second World War, and the nuclear idea in postwar Europe. She has also researched and written extensively on the history of Jews in the former Habsburg regional capital of Czernowitz (now Ukraine).</p>]]>
</Biography>
<Expertise>
<![CDATA[<p>Modern Europe, especially Germany and Central/East Central Europe in the nineteenth and twentieth centuries; European Jewish and women's history, East European and German film and literature, socialism, war, and revolution.</p>]]>
</Expertise>
<Image>http://www.kenyon.edu/images/directory/ablovatski.jpg</Image>
<Link>http://www.kenyon.edu/directories/campus-directory/biography/eliza-ablovatski/</Link>
<Books>
<Book>
<Year></Year>
<Details>
<![CDATA[<p><em>Zwischen Pruth und Jordan. Lebenserinnerungen Czernowitzer Juden</em><em> , </em>with Gaby Coldewey and others Köln: Böhlau Verlag, 2003</p>]]>
</Details>
</Book>
<Book>
<Year></Year>
<Details>
<![CDATA[<p><em>Czernowitz ist gewen an alt jiddische Stdt: Überlebende berichten,</em> With Gaby Coldewey and others. First Edition: Czernowitz,Ukraine: distributed by the Heinrich-Böll-Stiftung, 1998 Second Edition: Berlin, 1999 (Third edition: Potsdam, forthcoming 2009)</p>]]>
</Details>
</Book>
</Books>
<Articles>
<Article>
<Year></Year>
<Details>
<![CDATA[<p>"The Central European Revolutions of 1919 and the Myth of Judeo-Bolshevism," <em>European Review of History, Vol. 17/ Issue 3: Cosmopolitanism, Nationalism and the Jews of East Central Europe (2010), 473-489.</em></p>]]>
</Details>
</Article>
<Article>
<Year></Year>
<Details>
<![CDATA[<p>"Between Red Army and White Guard: Women in Budapest, 1918-1919," in <em>Gender and War in Twentieth-Century Eastern Europe,</em> edited by Maria Bucur and Nancy Wingfield Bloomington: Indiana University Press 2006</p>]]>
</Details>
</Article>
<Article>
<Year></Year>
<Details>
<![CDATA[<p>"The Girl with the Titus-head: Women in Revolution in Munich and Budapest, 1919" <em>Nationalities Papers </em>28/3 (September 2000), 541-550</p>]]>
</Details>
</Article>
</Articles>
<Papers>
</Papers>
<Artwork>
</Artwork>
<Websites>
</Websites>
</Person>
...This goes on to include many <Person> elements. (About 1000)
</People>
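Answer 0 (score: 1)
Thx for the actual XML! The following query gets your values out of the XML. It generates IDs for them so that all data can be stored in related tables.
Note: I had to double the ' character in women's, and I added a second Person to show the approach: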
DECLARE @x XML=
'<?xml version="1.0" encoding="UTF-8" ?>
<People>
<Person>
<FirstName>Eliza</FirstName>
<LastName>Ablovatski</LastName>
<Biography>
<![CDATA[<p>Eliza Ablovatski joined the Kenyon history department in 2003, after graduate work in East Central European history at Columbia University and research and fellowships in Munich and Berlin, Germany and Budapest, Hungary. She teaches classes on Europe from 1500 to the present, focusing on the nineteenth and twentieth centuries, Germany, Russia, the Habsburg Monarchy, film, nationalism and identity, gender, race, and the interwar period.</p>
<p>Her dissertation and first book, <em>Revolution and Political Violence in Central Europe: The Deluge of 1919</em> (forthcoming from Cambridge University Press), focus on the revolutionary upheavals in Munich and Budapest following the First World War, and their relationship to political violence and antisemitism. She is currently researching the occupation of Austria (1945-1955) at the end of the Second World War, and the nuclear idea in postwar Europe. She has also researched and written extensively on the history of Jews in the former Habsburg regional capital of Czernowitz (now Ukraine).</p>]]>
</Biography>
<Expertise>
<![CDATA[<p>Modern Europe, especially Germany and Central/East Central Europe in the nineteenth and twentieth centuries; European Jewish and women''s history, East European and German film and literature, socialism, war, and revolution.</p>]]>
</Expertise>
<Image>http://www.kenyon.edu/images/directory/ablovatski.jpg</Image>
<Link>http://www.kenyon.edu/directories/campus-directory/biography/eliza-ablovatski/</Link>
<Books>
<Book>
<Year></Year>
<Details>
<![CDATA[<p><em>Zwischen Pruth und Jordan. Lebenserinnerungen Czernowitzer Juden</em><em> , </em>with Gaby Coldewey and others Köln: Böhlau Verlag, 2003</p>]]>
</Details>
</Book>
<Book>
<Year></Year>
<Details>
<![CDATA[<p><em>Czernowitz ist gewen an alt jiddische Stdt: Überlebende berichten,</em> With Gaby Coldewey and others. First Edition: Czernowitz,Ukraine: distributed by the Heinrich-Böll-Stiftung, 1998 Second Edition: Berlin, 1999 (Third edition: Potsdam, forthcoming 2009)</p>]]>
</Details>
</Book>
</Books>
<Articles>
<Article>
<Year></Year>
<Details>
<![CDATA[<p>"The Central European Revolutions of 1919 and the Myth of Judeo-Bolshevism," <em>European Review of History, Vol. 17/ Issue 3: Cosmopolitanism, Nationalism and the Jews of East Central Europe (2010), 473-489.</em></p>]]>
</Details>
</Article>
<Article>
<Year></Year>
<Details>
<![CDATA[<p>"Between Red Army and White Guard: Women in Budapest, 1918-1919," in <em>Gender and War in Twentieth-Century Eastern Europe,</em> edited by Maria Bucur and Nancy Wingfield Bloomington: Indiana University Press 2006</p>]]>
</Details>
</Article>
<Article>
<Year></Year>
<Details>
<![CDATA[<p>"The Girl with the Titus-head: Women in Revolution in Munich and Budapest, 1919" <em>Nationalities Papers </em>28/3 (September 2000), 541-550</p>]]>
</Details>
</Article>
</Articles>
<Papers>
</Papers>
<Artwork>
</Artwork>
<Websites>
</Websites>
</Person>
<Person>
<FirstName>One</FirstName>
<LastName>More</LastName>
<Biography>Biography: Some interesting facts...</Biography>
<Expertise>Expertise: Some interesting facts...</Expertise>
<Image>somepicture.jpg</Image>
<Link>somelink.com</Link>
<Books>
<Book>
<Year>2001</Year>
<Details>Book1</Details>
</Book>
<Book>
<Year>2002</Year>
<Details>Book2</Details>
</Book>
</Books>
<Articles>
<Article>
<Year>2001</Year>
<Details>Article1</Details>
</Article>
</Articles>
<Papers>
</Papers>
<Artwork>
</Artwork>
<Websites>
</Websites>
</Person>
</People>';
With MyPersonCTE AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PersonID
,p.value('FirstName[1]','varchar(max)') AS FirstName
,p.value('LastName[1]','varchar(max)') AS LastName
,p.value('Biography[1]','varchar(max)') AS Biography
,p.value('Expertise[1]','varchar(max)') AS Expertise
,p.value('Image[1]','varchar(max)') AS Image
,p.value('Link[1]','varchar(max)') AS Link
,p.query('Books') AS BookNode
,p.query('Articles') AS ArticleNode
--same for Papers, Artwork...
FROM @x.nodes('/People/Person') AS A(p)
)
,MyBooksCTE AS
(
SELECT MyPersonCTE.*
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS BookID
,x.value('Year[1]','int') AS BookYear
,x.value('Details[1]','varchar(max)') AS BookDetails
FROM MyPersonCTE
CROSS APPLY MyPersonCTE.BookNode.nodes('/Books/Book') A(x)
)
,MyArticlesCTE AS
(
SELECT MyPersonCTE.*
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS ArticleID
,x.value('Year[1]','int') AS ArticleYear
,x.value('Details[1]','varchar(max)') AS ArticleDetails
FROM MyPersonCTE
CROSS APPLY MyPersonCTE.ArticleNode.nodes('/Articles/Article') A(x)
)
--same for Papers, Artwork...
SELECT p.*
,b.BookID
,b.BookYear
,b.BookDetails
,a.ArticleID
,a.ArticleYear
,a.ArticleDetails
INTO #tempAllData
FROM MyPersonCTE AS p
LEFT JOIN MyBooksCTE AS b ON p.PersonID=b.PersonID
LEFT JOIN MyArticlesCTE AS a ON p.PersonID=a.PersonID ;
--#tempAllData now holds all data in every combination (each book x each article per person): much too much
--but DISTINCT is your friend
--in this case you'd use the PersonID as FK in all related tables
SELECT DISTINCT PersonID,FirstName,LastName,Biography,Expertise --other fields
FROM #tempAllData;
SELECT DISTINCT PersonID,BookID,BookYear,BookDetails
FROM #tempAllData;
SELECT DISTINCT PersonID,ArticleID,ArticleYear,ArticleDetails
FROM #tempAllData;
DROP TABLE #tempAllData;
Person:
PersonID  FirstName  LastName    Biography
1         Eliza      Ablovatski  <p>Eliza Ablovatski joined ...
2         One        More        Biography: Some interesting facts...
Books:
PersonID  BookID  BookYear  BookDetails
1         1       0         <p><em>Zwischen Pruth und ...
1         2       0         <p><em>Czernowitz ist gewen ...
2         3       2001      Book1
2         4       2002      Book2
Articles:
PersonID  ArticleID  ArticleYear  ArticleDetails
1         1          0            <p>"The Central European ...
1         2          0            <p>"Between Red Army and White ...
1         3          0            <p>"The Girl with the Titus-head: ...
2         4          2001         Article1
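The query above reads the XML from a literal assigned to @x. When the data lives in a file, one possible way to fill the variable is OPENROWSET with the BULK option. This is only a sketch: the path C:\data\people.xml is a made-up placeholder, and the file has to be readable by the SQL Server service account. Reading it as SINGLE_BLOB lets the XML parser honor the encoding declared in the file itself.

```sql
-- Hypothetical path (an assumption, not from the original post).
DECLARE @x XML =
(
    SELECT CAST(BulkColumn AS XML)
    FROM OPENROWSET(BULK 'C:\data\people.xml', SINGLE_BLOB) AS src
);

-- Quick sanity check: how many <Person> elements were loaded?
SELECT @x.value('count(/People/Person)','int') AS PersonCount;
```

With @x filled this way, the CTEs shown above can run unchanged.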
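A single flat table with one column per book and article is only possible with dynamic SQL. Starting from the above, change the query to the following. It first finds the column names automatically, then uses UNION ALL to force all data into the same structure, and finally runs one big dynamic PIVOT:
Note: I added PARTITION BY PersonID to the ROW_NUMBERs of the related CTEs. This yields IDs starting with 1 for each person.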
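To see what PARTITION BY changes, here is a small stand-alone illustration. The table variable @books and its rows are made up for this sketch and are not part of the real data. It also orders by a real column so the numbering is deterministic, whereas the queries in this answer use ORDER BY (SELECT NULL).

```sql
-- Illustrative data only: two people with two books each.
DECLARE @books TABLE (PersonID INT, BookDetails VARCHAR(50));
INSERT INTO @books (PersonID, BookDetails)
VALUES (1,'Book A'),(1,'Book B'),(2,'Book C'),(2,'Book D');

SELECT PersonID
      ,BookDetails
      -- one running number over all rows
      ,ROW_NUMBER() OVER(ORDER BY PersonID, BookDetails) AS GlobalID
      -- numbering restarts at 1 for every person
      ,ROW_NUMBER() OVER(PARTITION BY PersonID ORDER BY BookDetails) AS PerPersonID
FROM @books
ORDER BY PersonID, BookDetails;

-- PersonID  BookDetails  GlobalID  PerPersonID
-- 1         Book A       1         1
-- 1         Book B       2         2
-- 2         Book C       3         1
-- 2         Book D       4         2
```

The per-person numbering is what keeps the number of Book_n columns equal to the largest book count of any single person, rather than the total number of books. The full query below applies the same idea in MyBooksCTE and MyArticlesCTE.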
With MyPersonCTE AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PersonID
,p.value('FirstName[1]','varchar(max)') AS FirstName
,p.value('LastName[1]','varchar(max)') AS LastName
,p.value('Biography[1]','varchar(max)') AS Biography
,p.value('Expertise[1]','varchar(max)') AS Expertise
,p.value('Image[1]','varchar(max)') AS Image
,p.value('Link[1]','varchar(max)') AS Link
,p.query('Books') AS BookNode
,p.query('Articles') AS ArticleNode
--same for Papers, Artwork...
FROM @x.nodes('/People/Person') AS A(p)
)
,MyBooksCTE AS
(
SELECT MyPersonCTE.*
,ROW_NUMBER() OVER(PARTITION BY PersonID ORDER BY (SELECT NULL)) AS BookID
,x.value('Year[1]','int') AS BookYear
,x.value('Details[1]','varchar(max)') AS BookDetails
FROM MyPersonCTE
CROSS APPLY MyPersonCTE.BookNode.nodes('/Books/Book') A(x)
)
,MyArticlesCTE AS
(
SELECT MyPersonCTE.*
,ROW_NUMBER() OVER(PARTITION BY PersonID ORDER BY (SELECT NULL)) AS ArticleID
,x.value('Year[1]','int') AS ArticleYear
,x.value('Details[1]','varchar(max)') AS ArticleDetails
FROM MyPersonCTE
CROSS APPLY MyPersonCTE.ArticleNode.nodes('/Articles/Article') A(x)
)
--same for Papers, Artwork...
SELECT p.*
,b.BookID
,b.BookYear
,b.BookDetails
,a.ArticleID
,a.ArticleYear
,a.ArticleDetails
INTO #tempAllData
FROM MyPersonCTE AS p
LEFT JOIN MyBooksCTE AS b ON p.PersonID=b.PersonID
LEFT JOIN MyArticlesCTE AS a ON p.PersonID=a.PersonID ;
--#tempAllData now holds all data in every combination (each book x each article per person): much too much
--but DISTINCT is your friend
--in this case you'd use the PersonID as FK in all related tables
SELECT DISTINCT PersonID,FirstName,LastName,Biography,Expertise --other fields
INTO #tempPerson
FROM #tempAllData;
SELECT DISTINCT PersonID,BookID,BookYear,BookDetails
INTO #tempBooks
FROM #tempAllData;
SELECT DISTINCT PersonID,ArticleID,ArticleYear,ArticleDetails
INTO #tempArticles
FROM #tempAllData;
DECLARE @columnNames VARCHAR(MAX)=
STUFF((SELECT DISTINCT ',Book_'+CAST(BookID AS VARCHAR(10)) FROM #tempBooks FOR XML PATH('')),1,1,'')
+(SELECT DISTINCT ',Article_'+CAST(ArticleID AS VARCHAR(10)) FROM #tempArticles FOR XML PATH(''));
DECLARE @cmd VARCHAR(MAX)=
'SELECT p.*
FROM
(
SELECT p.*
,''Book_''+CAST(BookID AS VARCHAR(10)) AS ColumnName
,ISNULL(CAST(BookYear AS VARCHAR(4)),'''') + '' '' + BookDetails AS Data
FROM #tempPerson AS p
INNER JOIN #tempBooks AS b ON p.PersonID=b.PersonID
UNION ALL
SELECT p.*
,''Article_''+CAST(ArticleID AS VARCHAR(10)) AS ColumnName
,ISNULL(CAST(ArticleYear AS VARCHAR(4)),'''') + '' '' + ArticleDetails AS Data
FROM #tempPerson AS p
INNER JOIN #tempArticles AS a ON p.PersonID=a.PersonID
) AS tbl
PIVOT
(
MAX(Data) FOR ColumnName IN(' + @columnNames + ')
) AS p;'
EXEC(@cmd);
DROP TABLE #tempArticles
DROP TABLE #tempBooks
DROP TABLE #tempPerson
DROP TABLE #tempAllData;
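The query above only returns the pivoted rows. Since the goal is a table, one option is to add INTO to the dynamic statement. The following reduced sketch shows the pattern on made-up data: #long and #PersonFlat are hypothetical names, and QUOTENAME is added as a precaution in case a generated column name ever contains unusual characters.

```sql
-- Illustrative data only: one row per (person, target column).
CREATE TABLE #long (PersonID INT, ColumnName VARCHAR(50), Data VARCHAR(100));
INSERT INTO #long (PersonID, ColumnName, Data)
VALUES (1,'Book_1','Book A'),(1,'Book_2','Book B'),(2,'Book_1','Book C');

-- Build the column list, e.g. [Book_1],[Book_2]
DECLARE @cols NVARCHAR(MAX)=
STUFF((SELECT DISTINCT ','+QUOTENAME(ColumnName) FROM #long FOR XML PATH('')),1,1,'');

-- SELECT ... INTO creates the target table from the pivoted result
DECLARE @cmd NVARCHAR(MAX)=
'SELECT p.*
 INTO #PersonFlat
 FROM
 (
    SELECT PersonID, ColumnName, Data FROM #long
 ) AS tbl
 PIVOT
 (
    MAX(Data) FOR ColumnName IN(' + @cols + ')
 ) AS p;
 SELECT * FROM #PersonFlat;';
EXEC(@cmd);

DROP TABLE #long;
```

A temp table created inside EXEC disappears when the dynamic batch ends, which is why the sketch selects from #PersonFlat within the same string. To keep the result, use a permanent table name (for example a hypothetical dbo.PersonFlat) instead.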