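I would like to display a list of entity names and values in a C#/.NET 4.0 application. I can retrieve the entity names easily enough using XmlDocument.DocumentType.Entities, but is there a good way to retrieve the values of those entities? I noticed that I can retrieve the value of text-only entities using InnerText, but this does not work for entities that contain XML tags. Is resorting to a regular expression the best way to go?
Suppose I have a document like this: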
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document [
<!ENTITY test "<para>only a test</para>">
<!ENTITY wwwc "World Wide Web Corporation">
<!ENTITY copy "©">
]>
<document>
<!-- The following image is the World Wide Web Corporation logo. -->
<graphics image="logo" alternative="&wwwc; Logo"/>
</document>
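I would like to present the user with a list containing the three entity names (test, wwwc, and copy) along with their values (the text in quotes following the name). I had not thought about entities nested within other entities, so I would be interested in a solution that either fully expands the entity values or shows the text exactly as it appears in the quotes.
Answer 0 (score: 2)
While this is unlikely to be the most elegant solution, I came up with something that seems to work for my purposes. First, I parse the original document and retrieve its entity nodes. Then I create a small in-memory XML document that carries the same entity declarations. Next, I add an entity reference for every entity to that temporary document. Finally, I retrieve the InnerXml of each reference.
Here is some sample code: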
// requires: using System.Collections.Generic; using System.Net; using System.Xml;
// parse the original document and retrieve its entities
XmlDocument parsedXmlDocument = new XmlDocument();
XmlUrlResolver resolver = new XmlUrlResolver();
resolver.Credentials = CredentialCache.DefaultCredentials;
parsedXmlDocument.XmlResolver = resolver;
parsedXmlDocument.Load(path);
// create a temporary xml document with all the entities and add references to them
// the references can then be used to retrieve the value for each entity
XmlDocument entitiesXmlDocument = new XmlDocument();
XmlDeclaration dec = entitiesXmlDocument.CreateXmlDeclaration("1.0", null, null);
entitiesXmlDocument.AppendChild(dec);
XmlDocumentType newDocType = entitiesXmlDocument.CreateDocumentType(parsedXmlDocument.DocumentType.Name, parsedXmlDocument.DocumentType.PublicId, parsedXmlDocument.DocumentType.SystemId, parsedXmlDocument.DocumentType.InternalSubset);
entitiesXmlDocument.AppendChild(newDocType);
XmlElement root = entitiesXmlDocument.CreateElement("xmlEntitiesDoc");
entitiesXmlDocument.AppendChild(root);
XmlNamedNodeMap entitiesMap = entitiesXmlDocument.DocumentType.Entities;
// build a dictionary of entity names and values
Dictionary<string, string> entitiesDictionary = new Dictionary<string, string>();
for (int i = 0; i < entitiesMap.Count; i++)
{
    XmlElement entityElement = entitiesXmlDocument.CreateElement(entitiesMap.Item(i).Name);
    XmlEntityReference entityRefElement = entitiesXmlDocument.CreateEntityReference(entitiesMap.Item(i).Name);
    entityElement.AppendChild(entityRefElement);
    root.AppendChild(entityElement);
    if (!string.IsNullOrEmpty(entityElement.ChildNodes[0].InnerXml))
    {
        // do not add parameter entities or invalid entities
        // this can be determined by checking for an empty string
        entitiesDictionary.Add(entitiesMap.Item(i).Name, entityElement.ChildNodes[0].InnerXml);
    }
}
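For readers who want something they can run as-is, here is a minimal, self-contained sketch of the same idea. It is a variation rather than the answer's exact code: it assumes the entity reference can be expanded on a detached element of the already-parsed document, so no temporary document is built. The class name EntityValueSketch, the method GetEntityValues, and the element name "holder" are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class EntityValueSketch
{
    // Hypothetical helper (not from the answer): maps each entity name to the
    // markup obtained by expanding a reference to that entity.
    public static Dictionary<string, string> GetEntityValues(XmlDocument doc)
    {
        var values = new Dictionary<string, string>();
        if (doc.DocumentType == null)
        {
            return values; // no DOCTYPE, so no declared entities
        }

        foreach (XmlNode entity in doc.DocumentType.Entities)
        {
            // Assumption: a detached element is enough; it is never added to the tree.
            XmlElement holder = doc.CreateElement("holder");
            XmlEntityReference reference = doc.CreateEntityReference(entity.Name);
            holder.AppendChild(reference); // the reference is expanded when it gets a parent

            string value = reference.InnerXml;
            if (!string.IsNullOrEmpty(value)) // same empty-string filter as the answer
            {
                values[entity.Name] = value;
            }
        }
        return values;
    }

    public static void Main()
    {
        const string xml = @"<!DOCTYPE document [
<!ENTITY test ""<para>only a test</para>"">
<!ENTITY wwwc ""World Wide Web Corporation"">
<!ENTITY copy ""©"">
]>
<document>
<graphics image=""logo"" alternative=""&wwwc; Logo""/>
</document>";

        XmlDocument doc = new XmlDocument();
        doc.XmlResolver = null; // the sample has no external resources to resolve
        doc.LoadXml(xml);

        foreach (KeyValuePair<string, string> pair in GetEntityValues(doc))
        {
            Console.WriteLine("{0} = {1}", pair.Key, pair.Value);
        }
    }
}
```

Because the expansion is done by the parser, entities that reference other entities should come back fully expanded rather than as the literal text between the quotes.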
Answer 1 (score: 1)
Here is one way (untested) that uses an XmlReader and that class's ResolveEntity() method:
private Dictionary<string, string> GetEntities(XmlReader xr)
{
    Dictionary<string, string> entityList = new Dictionary<string, string>();
    while (xr.Read())
    {
        HandleNode(xr, entityList);
    }
    return entityList;
}

StringBuilder sbEntityResolver;
int extElementIndex = 0;
int resolveEntityNestLevel = -1;
string dtdCurrentTopEntity = "";

private void HandleNode(XmlReader inReader, Dictionary<string, string> entityList)
{
    if (inReader.NodeType == XmlNodeType.Element)
    {
        if (resolveEntityNestLevel < 0)
        {
            while (inReader.MoveToNextAttribute())
            {
                HandleNode(inReader, entityList); // for namespaces
                while (inReader.ReadAttributeValue())
                {
                    HandleNode(inReader, entityList); // recursive for resolving entity refs in attributes
                }
            }
        }
        else
        {
            extElementIndex++;
            sbEntityResolver.Append(inReader.ReadOuterXml());
            resolveEntityNestLevel--;
            if (!entityList.ContainsKey(dtdCurrentTopEntity))
            {
                entityList.Add(dtdCurrentTopEntity, sbEntityResolver.ToString());
            }
        }
    }
    else if (inReader.NodeType == XmlNodeType.EntityReference)
    {
        if (inReader.Name[0] != '#' && !entityList.ContainsKey(inReader.Name))
        {
            if (resolveEntityNestLevel < 0)
            {
                sbEntityResolver = new StringBuilder(); // start building entity
                dtdCurrentTopEntity = inReader.Name;
            }
            // entityReference can have contents that contains other
            // entityReferences, so keep track of nest level
            resolveEntityNestLevel++;
            inReader.ResolveEntity();
        }
    }
    else if (inReader.NodeType == XmlNodeType.EndEntity)
    {
        resolveEntityNestLevel--;
        if (resolveEntityNestLevel < 0)
        {
            if (!entityList.ContainsKey(dtdCurrentTopEntity))
            {
                entityList.Add(dtdCurrentTopEntity, sbEntityResolver.ToString());
            }
        }
    }
    else if (inReader.NodeType == XmlNodeType.Text)
    {
        if (resolveEntityNestLevel > -1)
        {
            sbEntityResolver.Append(inReader.Value);
        }
    }
}
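Answer 2 (score: 0)
If you have an XmlDocument object, it may be easier to step recursively through each XmlNode object (starting from XmlDocument.ChildNodes). For each node, you can use the Name property to get the node's name. "Getting the value" then depends on what you want: InnerXml for a string representation, or ChildNodes for programmatic access to the XmlNode objects, which can be cast to more specific types such as XmlEntity.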
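The recursive walk described above can be sketched roughly as follows. This is only an illustration under a few assumptions: the class name NodeWalkerSketch and the method Walk are hypothetical, and the sketch records just the node type and Name of each node. Entities are listed from XmlDocumentType.Entities because they are kept in that map rather than among the DOCTYPE node's ChildNodes.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class NodeWalkerSketch
{
    // Hypothetical helper: appends one indented "NodeType Name" line per node.
    public static void Walk(XmlNode node, int depth, List<string> lines)
    {
        lines.Add(new string(' ', depth * 2) + node.NodeType + " " + node.Name);

        // Entities are not children of the DOCTYPE node; they live in a separate map.
        XmlDocumentType docType = node as XmlDocumentType;
        if (docType != null)
        {
            foreach (XmlNode entity in docType.Entities)
            {
                lines.Add(new string(' ', (depth + 1) * 2) + entity.NodeType + " " + entity.Name);
            }
        }

        // Recurse into ordinary children (elements, text, comments, ...).
        foreach (XmlNode child in node.ChildNodes)
        {
            Walk(child, depth + 1, lines);
        }
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.XmlResolver = null; // nothing external to resolve in this sample
        doc.LoadXml("<!DOCTYPE document [<!ENTITY wwwc \"World Wide Web Corporation\">]>" +
                    "<document><graphics image=\"logo\"/></document>");

        var lines = new List<string>();
        Walk(doc, 0, lines);
        foreach (string line in lines)
        {
            Console.WriteLine(line);
        }
    }
}
```

Attributes are not part of ChildNodes either, so a fuller version would also loop over node.Attributes when it is not null.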
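Answer 3 (score: 0)
You can easily display a representation of an XML document simply by walking the tree recursively.
This small class happens to write to the console, but you can easily modify it to suit your needs.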
public static class XmlPrinter {
    private const Int32 SpacesPerIndent = 3;

    public static void Print(XDocument xDocument) {
        if (xDocument == null) {
            Console.WriteLine("No XML Document Provided");
            return;
        }
        PrintElementRecursive(xDocument.Root);
    }

    private static void PrintElementRecursive(XElement element, Int32 indentationLevel = 0) {
        if (element == null) return;
        PrintIndentation(indentationLevel);
        PrintElement(element);
        PrintNewline();
        foreach (var xAttribute in element.Attributes()) {
            PrintIndentation(indentationLevel + 1);
            PrintAttribute(xAttribute);
            PrintNewline();
        }
        foreach (var xElement in element.Elements()) {
            PrintElementRecursive(xElement, indentationLevel + 1);
        }
    }

    private static void PrintAttribute(XAttribute xAttribute) {
        if (xAttribute == null) return;
        Console.Write("[{0}] = \"{1}\"", xAttribute.Name, xAttribute.Value);
    }

    private static void PrintElement(XElement element) {
        if (element == null) return;
        Console.Write("{0}", element.Name);
        if (!String.IsNullOrWhiteSpace(element.Value))
            Console.Write(" : {0}", element.Value);
    }

    private static void PrintIndentation(Int32 level) {
        Console.Write(new String(' ', level * SpacesPerIndent));
    }

    private static void PrintNewline() {
        Console.Write(Environment.NewLine);
    }
}
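Using the class is trivial. Here is an example that prints out the current configuration file:
static void Main(string[] args) {
    XmlPrinter.Print(XDocument.Load(
        ConfigurationManager.OpenExeConfiguration(ConfigurationUserLevel.None).FilePath
    ));
    Console.ReadKey();
}
Try it yourself; you should be able to modify it quickly to get what you want.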