I have two XML files with the same schema that I want to merge into a single XML file. Is there an easy way to do this?
For example:
<Root>
  <LeafA>
    <Item1 />
    <Item2 />
  </LeafA>
  <LeafB>
    <Item1 />
    <Item2 />
  </LeafB>
</Root>
+
<Root>
  <LeafA>
    <Item3 />
    <Item4 />
  </LeafA>
  <LeafB>
    <Item3 />
    <Item4 />
  </LeafB>
</Root>
= a new file containing:
<Root>
  <LeafA>
    <Item1 />
    <Item2 />
    <Item3 />
    <Item4 />
  </LeafA>
  <LeafB>
    <Item1 />
    <Item2 />
    <Item3 />
    <Item4 />
  </LeafB>
</Root>
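Answer 0 (score: 11)
"Automatic XML merge" sounds like a relatively simple requirement, but once you go into all the details it gets complex pretty fast. Merging with C# or XSLT is much easier for a more specific task, as in the answer for the EF model. Using a tool to assist with a manual merge is also an option (see this SO question).
For reference (and to give an idea of the complexity), here is an open-source example from the Java world: XML merging made easy
Back to the original question. The task specification has a few big gray areas: when two elements should be considered equivalent (same name, matching selected or all attributes, or also the same position in the parent element); how to handle the case where the original or the merged XML contains multiple equivalent elements; and so on.
The code below assumes the following: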
// requires: using System.Linq; using System.Xml.Linq;
//
// Determine which elements we consider the same:
// equal names and identical sets of attribute name/value pairs
//
private static bool AreEquivalent(XElement a, XElement b)
{
    if (a.Name != b.Name) return false;
    if (!a.HasAttributes && !b.HasAttributes) return true;
    if (!a.HasAttributes || !b.HasAttributes) return false;
    if (a.Attributes().Count() != b.Attributes().Count()) return false;
    return a.Attributes().All(attA => b.Attributes(attA.Name)
        .Any(attB => attB.Value == attA.Value));
}
// Merge "merged" document B into "source" A
//
private static void MergeElements(XElement parentA, XElement parentB)
{
    // merge per-element content from parentB into parentA;
    // only direct children are visited here, deeper levels are
    // handled by the recursive call below
    //
    foreach (XElement childB in parentB.Elements())
    {
        // merge childB with first equivalent childA
        // equivalent childB1, childB2,.. will be combined
        //
        bool isMatchFound = false;
        foreach (XElement childA in parentA.Elements())
        {
            if (AreEquivalent(childA, childB))
            {
                MergeElements(childA, childB);
                isMatchFound = true;
                break;
            }
        }
        // if there is no equivalent childA, add childB into parentA
        //
        if (!isMatchFound) parentA.Add(childB);
    }
}
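It produces the desired result for the original XML fragments, but if the input XML is more complex and contains duplicate elements, the result will be more... interesting: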
public static void Test()
{
    var a = XDocument.Parse(@"
        <Root>
            <LeafA>
                <Item1 />
                <Item2 />
                <SubLeaf><X/></SubLeaf>
            </LeafA>
            <LeafB>
                <Item1 />
                <Item2 />
            </LeafB>
        </Root>");
    var b = XDocument.Parse(@"
        <Root>
            <LeafB>
                <Item5 />
                <Item1 />
                <Item6 />
            </LeafB>
            <LeafA Name=""X"">
                <Item3 />
            </LeafA>
            <LeafA>
                <Item3 />
            </LeafA>
            <LeafA>
                <SubLeaf><Y/></SubLeaf>
            </LeafA>
        </Root>");
    MergeElements(a.Root, b.Root);
    Console.WriteLine("Merged document:\n{0}", a.Root);
}
Here is the merged document, showing how equivalent elements from document B were combined:
<Root>
  <LeafA>
    <Item1 />
    <Item2 />
    <SubLeaf>
      <X />
      <Y />
    </SubLeaf>
    <Item3 />
  </LeafA>
  <LeafB>
    <Item1 />
    <Item2 />
    <Item5 />
    <Item6 />
  </LeafB>
  <LeafA Name="X">
    <Item3 />
  </LeafA>
</Root>
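For readers who are not on .NET, here is a rough Python sketch of the same idea using the standard library's xml.etree.ElementTree. The names are_equivalent and merge_elements are only illustrative ports of the C# methods above, and equivalence is assumed to mean "same tag and identical attributes". Treat it as a sketch under those assumptions, not as a general-purpose merge tool.

```python
import copy
import xml.etree.ElementTree as ET


def are_equivalent(a: ET.Element, b: ET.Element) -> bool:
    """Assumption: two elements match when tag and all attributes are equal."""
    return a.tag == b.tag and a.attrib == b.attrib


def merge_elements(parent_a: ET.Element, parent_b: ET.Element) -> None:
    """Merge the direct children of parent_b into parent_a, in place."""
    for child_b in list(parent_b):
        # Find the first equivalent direct child that parent_a already has.
        match = next((c for c in parent_a if are_equivalent(c, child_b)), None)
        if match is None:
            # No equivalent child: copy the whole subtree over.
            parent_a.append(copy.deepcopy(child_b))
        else:
            # Equivalent child found: merge one level deeper.
            merge_elements(match, child_b)


a = ET.fromstring("<Root><LeafA><Item1/><Item2/></LeafA></Root>")
b = ET.fromstring("<Root><LeafA><Item3/></LeafA><LeafB><Item1/></LeafB></Root>")
merge_elements(a, b)
print(ET.tostring(a, encoding="unicode"))
```

As in the C# version, elements of B that are equivalent to each other end up combined under the first matching element of A.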
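Answer 1 (score: 1)
If the format is always exactly like this, there is nothing wrong with this method:
Remove the last two lines from the first file, then append the second file with its first two lines removed.
Have a look at the Linux commands head and tail, which can strip the first and last two lines.
Answer 2 (score: 1)
Here is a simple XSLT transformation (to be applied to the document a.xml):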
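<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- b.xml is resolved relative to the location of this stylesheet -->
  <xsl:variable name="docB" select="document('b.xml')"/>
  <xsl:template match="Root">
    <Root><xsl:apply-templates/></Root>
  </xsl:template>
  <!-- xsl:copy keeps the LeafA/LeafB wrapper around the combined children -->
  <xsl:template match="Root/LeafA">
    <xsl:copy>
      <xsl:copy-of select="*"/>
      <xsl:copy-of select="$docB/Root/LeafA/*"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="Root/LeafB">
    <xsl:copy>
      <xsl:copy-of select="*"/>
      <xsl:copy-of select="$docB/Root/LeafB/*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>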
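The same "append the children of the matching leaf" idea can also be sketched outside XSLT. The Python below is only an illustration: append_matching_children is a made-up name, and unlike the transformation above (which names LeafA and LeafB explicitly) it assumes that every direct child of the root should be paired with the first same-named child in the second document.

```python
import copy
import xml.etree.ElementTree as ET


def append_matching_children(root_a: ET.Element, root_b: ET.Element) -> ET.Element:
    """Return a copy of root_a in which each direct child also holds the
    children of the first same-named direct child of root_b."""
    merged = copy.deepcopy(root_a)
    for leaf in merged:
        other = root_b.find(leaf.tag)  # first direct child of root_b with this tag
        if other is not None:
            leaf.extend([copy.deepcopy(child) for child in other])
    return merged


doc_a = ET.fromstring("<Root><LeafA><Item1/></LeafA><LeafB><Item1/></LeafB></Root>")
doc_b = ET.fromstring("<Root><LeafA><Item3/></LeafA><LeafB><Item3/></LeafB></Root>")
print(ET.tostring(append_matching_children(doc_a, doc_b), encoding="unicode"))
```

Neither input is modified; leaves that exist only in the second document are ignored, which matches what the stylesheet does.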
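Answer 3 (score: 0)
vimdiff file_a file_b
is just one example.
When I'm on Windows, BeyondCompare (http://www.scootersoftware.com/) is my favorite.
Answer 4 (score: 0)
I ended up using C# and writing a script. When I asked the question I knew I could do that, but I wanted to know whether there was a quicker way, since I had never really worked with XML.
The script goes along these lines: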
var a = new XmlDocument();
a.Load(PathToFile1);
var b = new XmlDocument();
b.Load(PathToFile2);
MergeNodes(
a.SelectSingleNode(nodePath),
b.SelectSingleNode(nodePath).ChildNodes,
a);
a.Save(PathToFile1);
MergeNodes() looks like this:
private void MergeNodes(XmlNode parentNodeA, XmlNodeList childNodesB, XmlDocument parentA)
{
    foreach (XmlNode oNode in childNodesB)
    {
        // Skip comment nodes
        if (oNode.Name == "#comment") continue;
        bool isFound = false;
        // null when the node has no Name attribute
        string name = oNode.Attributes?["Name"]?.Value;
        foreach (XmlNode child in parentNodeA.ChildNodes)
        {
            if (child.Name == "#comment") continue;
            // If node already exists and is unchanged, exit loop
            if (child.OuterXml == oNode.OuterXml)
            {
                isFound = true;
                Console.WriteLine("Found::NoChanges::" + oNode.Name + "::" + name);
                break;
            }
            // If node already exists but has been changed, replace it
            if (name != null && child.Attributes?["Name"]?.Value == name)
            {
                isFound = true;
                Console.WriteLine("Found::Replaced::" + oNode.Name + "::" + name);
                parentNodeA.ReplaceChild(parentA.ImportNode(oNode, true), child);
                // Stop here: the child list was just modified
                break;
            }
        }
        // If node does not exist, add it
        if (!isFound)
        {
            Console.WriteLine("NotFound::Adding::" + oNode.Name + "::" + name);
            parentNodeA.AppendChild(parentA.ImportNode(oNode, true));
        }
    }
}
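It's not perfect - I have to manually specify the nodes I want to merge - but it was quick and easy for me to put together and, given that I know next to nothing about XML, I'm happy :)
It actually works out better that it only merges the specified nodes, since I'm using it to merge Entity Framework's edmx files, where I really only want to merge the SSDL, CSDL and MSL nodes.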
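In case it helps anyone outside C#, the replace-or-append logic can be sketched in a few lines of Python. This is only an approximation of my script: merge_by_name is a made-up name, matching is assumed to be on the Name attribute alone, and a node without that attribute is simply appended.

```python
import copy
import xml.etree.ElementTree as ET


def merge_by_name(parent_a: ET.Element, parent_b: ET.Element, key: str = "Name") -> None:
    """Copy each child of parent_b into parent_a, replacing the child of
    parent_a that carries the same value in the `key` attribute, if any."""
    for node_b in parent_b:
        name = node_b.get(key)
        replacement = copy.deepcopy(node_b)
        for index, node_a in enumerate(parent_a):
            if name is not None and node_a.get(key) == name:
                parent_a[index] = replacement  # replace the existing node in place
                break
        else:
            # No child of parent_a has this name (or node_b has none): add it.
            parent_a.append(replacement)


a = ET.fromstring('<Schema><EntityType Name="Order" Version="1"/></Schema>')
b = ET.fromstring('<Schema><EntityType Name="Order" Version="2"/><EntityType Name="Item"/></Schema>')
merge_by_name(a, b)
print(ET.tostring(a, encoding="unicode"))
```

ET.fromstring drops comments by default, so the sketch needs no equivalent of the "#comment" check.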
Answer 5 (score: 0)
You can do this by loading the XML files into DataSets and merging the DataSets.
Dim dsFirst As New DataSet()
Dim dsMerge As New DataSet()
' Create new FileStream with which to read the schema.
Dim fsReadXmlFirst As New System.IO.FileStream(myXMLfileFirst, System.IO.FileMode.Open)
Dim fsReadXmlMerge As New System.IO.FileStream(myXMLfileMerge, System.IO.FileMode.Open)
Try
dsFirst.ReadXml(fsReadXmlFirst)
dsMerge.ReadXml(fsReadXmlMerge)
Dim str As String = "Merge Table(0) Row Count = " & dsMerge.Tables(0).Rows.Count
str = str & Chr(13) & "Merge Table(1) Row Count = " & dsMerge.Tables(1).Rows.Count
str = str & Chr(13) & "Merge Table(2) Row Count = " & dsMerge.Tables(2).Rows.Count
MsgBox(str)
dsMerge.Merge(dsFirst, True)
DataGridParent.DataSource = dsMerge
DataGridParent.DataMember = "rulefile"
DataGridChild.DataSource = dsMerge
DataGridChild.DataMember = "rule"
str = ""
str = "Merge Table(0) Row Count = " & dsMerge.Tables(0).Rows.Count
str = str & Chr(13) & "Merge Table(1) Row Count = " & dsMerge.Tables(1).Rows.Count
str = str & Chr(13) & "Merge Table(2) Row Count = " & dsMerge.Tables(2).Rows.Count
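MsgBox(str)
Finally
' Release the file handles whether or not reading succeeded
fsReadXmlFirst.Close()
fsReadXmlMerge.Close()
End Try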
Answer 6 (score: 0)
Reposting an answer from https://www.perlmonks.org/?node_id=127848
Paste the following into a Perl script:
use strict;
require 5.000;
use Data::Dumper;
use XML::Simple;
use Hash::Merge;
my $xmlFile1 = shift || die "XmlFile1\n";
my $xmlFile2 = shift || die "XmlFile2\n";
my %config1 = %{XMLin ($xmlFile1)};
my %config2 = %{XMLin ($xmlFile2)};
my $merger = Hash::Merge->new ('RIGHT_PRECEDENT');
my %newhash = %{ $merger->merge (\%config1, \%config2) };
# XMLout (\%newhash, outputfile => "newfile", xmldecl => 1, rootname => 'config');
print XMLout (\%newhash);
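To show what a right-precedence merge of nested hashes amounts to, here is a small Python sketch. It is only loosely analogous to the Perl above: merge_right is a made-up helper, the rule for lists (concatenate left then right) is an assumption, and it is not claimed to match Hash::Merge in every case.

```python
def merge_right(left, right):
    """Recursively merge two nested structures; on a conflict the right value wins."""
    if isinstance(left, dict) and isinstance(right, dict):
        merged = dict(left)
        for key, value in right.items():
            # Keys present on both sides are merged recursively.
            merged[key] = merge_right(left[key], value) if key in left else value
        return merged
    if isinstance(left, list) and isinstance(right, list):
        return left + right  # assumption: keep the entries of both lists
    return right


config1 = {"server": {"host": "a", "port": 80}, "tags": ["x"]}
config2 = {"server": {"port": 8080}, "tags": ["y"], "debug": True}
print(merge_right(config1, config2))
```

Keep in mind that this merges by key rather than by document order, so it suits configuration-style XML better than documents where element order matters.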