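I plan to implement a method that compares two large XML files (though each file has fewer than 10,000 element lines).
The approach below works, but it performs poorly once a file has more than about 100 lines: it becomes noticeably slow. How can I find a more efficient solution? It may call for better C# program design, or better use of C# and XML processing.
Thanks in advance for your input.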
// Remove each item that appears in neither the Event XML file nor the ConfAddition XML file
XmlDocument doc = new XmlDocument();
doc.Load(xmlFile_AlarmSettingUp);
bool isNewAlid_Event = false;
bool isNewAlid_ConfAddition = false;
int alid = 0;
XmlNodeList xnList = doc.SelectNodes("/Equipment/AlarmSettingUp/EnabledALIDs/ALID");
foreach (XmlNode xn in xnList)
{
    XmlAttributeCollection attCol = xn.Attributes;
    for (int i = 0; i < attCol.Count; ++i)
    {
        if (attCol[i].Name == "alid")
        {
            alid = int.Parse(attCol[i].Value.ToString());
            break;
        }
    }
    //alid = int.Parse(attCol[1].Value.ToString());
    XmlDocument docEvent_Alarm = new XmlDocument();
    docEvent_Alarm.Load(xmlFile_Event);
    XmlNodeList xnListEvent_Alarm = docEvent_Alarm.SelectNodes("/Equipment/Alarms/ALID");
    foreach (XmlNode xnEvent_Alarm in xnListEvent_Alarm)
    {
        XmlAttributeCollection attColEvent_Alarm = xnEvent_Alarm.Attributes;
        int alidEvent_Alarm = int.Parse(attColEvent_Alarm[1].Value.ToString());
        if (alid == alidEvent_Alarm)
        {
            isNewAlid_Event = false;
            break;
        }
        else
        {
            isNewAlid_Event = true;
            //break;
        }
    }
    XmlDocument docConfAddition_Alarm = new XmlDocument();
    docConfAddition_Alarm.Load(xmlFile_ConfAddition);
    XmlNodeList xnListConfAddition_Alarm = docConfAddition_Alarm.SelectNodes("/Equipment/Alarms/ALID");
    foreach (XmlNode xnConfAddition_Alarm in xnListConfAddition_Alarm)
    {
        XmlAttributeCollection attColConfAddition_Alarm = xnConfAddition_Alarm.Attributes;
        int alidConfAddition_Alarm = int.Parse(attColConfAddition_Alarm[1].Value.ToString());
        if (alid == alidConfAddition_Alarm)
        {
            isNewAlid_ConfAddition = false;
            break;
        }
        else
        {
            isNewAlid_ConfAddition = true;
            //break;
        }
    }
    if ( isNewAlid_Event && isNewAlid_ConfAddition )
    {
        // Store the root node of the destination document into an XmlNode
        XmlNode rootDest = doc.SelectSingleNode("/Equipment/AlarmSettingUp/EnabledALIDs");
        rootDest.RemoveChild(xn);
    }
}
doc.Save(xmlFile_AlarmSettingUp);
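My XML files look like this. The two XML files have the same format, except that one of them may occasionally be modified by my application. That is why I need to compare them.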
<?xml version="1.0" encoding="utf-8"?>
<Equipment xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Licence LicenseId="" LicensePath="" />
  <!--Alarm Setting Up XML File-->
  <AlarmSettingUp>
    <EnabledALIDs>
      <ALID logicalName="Misc_EV_RM_STATION_ALREADY_RESERVED" alid="536870915" alcd="7" altx="Misc_Station 1 UnitName 2 SlotId already reserved" ceon="Misc_AlarmOn_EV_RM_STATION_ALREADY_RESERVED" ceoff="Misc_AlarmOff_EV_RM_STATION_ALREADY_RESERVED" />
      <ALID logicalName="Misc_EV_RM_SEQ_READ_ERROR" alid="536870916" alcd="7" altx="Misc_Sequence ID 1 d step 2 d read error for wafer in 3 UnitName 4 SlotId" ceon="Misc_AlarmOn_EV_RM_SEQ_READ_ERROR" ceoff="Misc_AlarmOff_EV_RM_SEQ_READ_ERROR" />
      ...
      ...
      ...
    </EnabledALIDs>
  </AlarmSettingUp>
</Equipment>
Answer 0 (score: 1)
"ALID/@alid" appears to be your key, so the first thing I would do (before foreach (XmlNode xn in xnList)) is build a dictionary over docEvent_Alarm.SelectNodes("/Equipment/Alarms/ALID"), keyed on the @alid value (assuming it is unique). You can then do most of the work without O(n*m) performance; it becomes closer to O(n+m), which is a big difference.
var lookup = new Dictionary<string, XmlElement>();
foreach(XmlElement el in docEvent_Alarm.SelectNodes("/Equipment/Alarms/ALID")) {
    lookup.Add(el.GetAttribute("alid"), el);
}
You can then use:
XmlElement other;
if(lookup.TryGetValue(otherKey, out other)) {
    // exists; element now in "other"
} else {
    // doesn't exist
}
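To show how the lookup idea might fit into the asker's whole loop, here is a minimal sketch, not a definitive implementation. The names AlidPruner, CollectAlids and RemoveUnknown are hypothetical. It assumes only membership matters, so it uses a HashSet<string> instead of a Dictionary (which also tolerates duplicate alid values), and it assumes comparing alid as text is equivalent to the original int comparison (no leading zeros or stray whitespace). It also treats an empty comparison file as "not found", which is stricter than the original flag logic.

```csharp
using System.Collections.Generic;
using System.Xml;

static class AlidPruner
{
    // Collects the alid attribute of every /Equipment/Alarms/ALID element into a set.
    public static HashSet<string> CollectAlids(XmlDocument doc)
    {
        var ids = new HashSet<string>();
        foreach (XmlElement el in doc.SelectNodes("/Equipment/Alarms/ALID"))
        {
            ids.Add(el.GetAttribute("alid"));
        }
        return ids;
    }

    // Removes every enabled ALID whose alid is in neither set; returns how many were removed.
    public static int RemoveUnknown(XmlDocument settings, HashSet<string> eventIds, HashSet<string> confIds)
    {
        // Copy the matches first so nodes are not removed from the list being enumerated.
        var enabled = new List<XmlElement>();
        foreach (XmlElement el in settings.SelectNodes("/Equipment/AlarmSettingUp/EnabledALIDs/ALID"))
        {
            enabled.Add(el);
        }

        int removed = 0;
        foreach (XmlElement el in enabled)
        {
            string alid = el.GetAttribute("alid");
            if (!eventIds.Contains(alid) && !confIds.Contains(alid))
            {
                el.ParentNode.RemoveChild(el);
                removed++;
            }
        }
        return removed;
    }
}
```

In the asker's code this would mean loading xmlFile_Event and xmlFile_ConfAddition once each, calling CollectAlids on both documents, then calling RemoveUnknown on doc before doc.Save(xmlFile_AlarmSettingUp).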
Answer 1 (score: 1)
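XmlDocument and its related classes (XmlNode, ...) are not very fast at XML processing. Try using XmlTextReader instead.
You are also calling docEvent_Alarm.Load(xmlFile_Event); and docConfAddition_Alarm.Load(xmlFile_ConfAddition); on every iteration of the outer loop, which is not good. If xmlFile_Event and xmlFile_ConfAddition do not change during processing, it is better to load them once, before the main loop.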
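As a rough illustration of the streaming approach, the sketch below reads the alid values in a single forward pass without building a DOM. It uses XmlReader.Create rather than constructing XmlTextReader directly, which is a substitution on my part. The names AlidReader and ReadAlids are hypothetical, and the code assumes the layout implied by the question's XPath: ALID elements sit directly under a second-level Alarms element.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class AlidReader
{
    // Streams through the XML once and returns the alid of each ALID element
    // whose parent is a second-level Alarms element (assumed layout).
    public static HashSet<string> ReadAlids(TextReader input)
    {
        var ids = new HashSet<string>();
        string section = null; // name of the most recent element at depth 1

        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element)
                {
                    continue;
                }
                if (reader.Depth == 1)
                {
                    section = reader.Name;
                }
                else if (reader.Depth == 2 && section == "Alarms" && reader.Name == "ALID")
                {
                    string alid = reader.GetAttribute("alid");
                    if (alid != null)
                    {
                        ids.Add(alid);
                    }
                }
            }
        }
        return ids;
    }
}
```

Each comparison file would be read this way once, before the main loop, for example by passing File.OpenText(xmlFile_Event) inside a using block; the resulting sets can then be queried with Contains for every enabled ALID.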
Answer 2 (score: 1)
Have you tried using Microsoft's XmlDiff class? See http://msdn.microsoft.com/en-us/library/aa302294.aspx