C# XML parsing: need to get the text

Asked: 2017-10-13 07:13:27

Tags: c# xml xml-parsing

I have code like this:

using System;
using System.IO;
using System.Xml.Serialization;

namespace ConsoleApp1
{
    [XmlRoot(ElementName = "doc")]
    public class Doc
    {
        [XmlElement(ElementName = "headline")]
        public string Headline { get; set; }
    }

    static class Program
    {
        static void Main(string[] args)
        {
            Doc res;

            var serializer = new XmlSerializer(typeof(Doc));
            using (var reader = new StringReader(File.ReadAllText("test.xml")))
            {
                res = (Doc) serializer.Deserialize(reader);
            }

            Console.Out.WriteLine(res.Headline.ToString());
        }
    }
}

My test.xml file contains the following:

<doc>
    <headline>AZERTY on the English <hlword>QWERTY</hlword> layout.
    </headline>
</doc>

When I try to parse it, I get an exception:

System.InvalidOperationException occurred
  HResult=0x80131509
  Message=There is an error in XML document (2, 35).
  Source=System.Xml
  StackTrace:
   at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
   at System.Xml.Serialization.XmlSerializer.Deserialize(TextReader textReader)
   at ConsoleApp1.Program.Main(String[] args) in D:\Documents\Visual Studio 2017\Projects\ConsoleApp1\ConsoleApp1\Program.cs:line 24

Inner Exception 1:
XmlException: Unexpected node type Element. ReadElementString method can only be called on elements with simple or empty content. Line 2, position 35.

From files like this I need to get either AZERTY on the English <hlword>QWERTY</hlword> layout. or AZERTY on the English QWERTY layout. as output. How should I declare the Headline property of Doc to get this text (perhaps by calling ToString() on the property)?

P.S. I am using Windows 10 with the Creators Update and Visual Studio 2017 (15.3.3).

2 answers:

Answer 0 (score: 2)

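The error tells you that the serializer cannot parse <headline>AZERTY on the English <hlword>QWERTY</hlword> layout. as a simple string, because the content contains an element. This is called mixed content. To parse it, you need to change your XML object classes to something like this: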

[XmlRoot(ElementName = "doc")]
public class Doc
{
    [XmlElement(ElementName = "headline")]
    public Headline Headline { get; set; }
}

public class Headline
{
    [XmlText]
    public string Content { get; set; }

    [XmlElement(ElementName = "hlword")]
    public string HlWord { get; set; }
}
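
As a rough usage sketch of the classes above (not verified against the asker's exact file): the helper name Parse, the class name MixedContentDemo, and the inline XML string are assumptions made for illustration only. Note that this model stores the text and the hlword value in two separate properties, so it does not record where <hlword> sat inside the sentence.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot(ElementName = "doc")]
public class Doc
{
    [XmlElement(ElementName = "headline")]
    public Headline Headline { get; set; }
}

public class Headline
{
    // Receives the character data found directly inside <headline>.
    [XmlText]
    public string Content { get; set; }

    // Receives the text of the nested <hlword> element.
    [XmlElement(ElementName = "hlword")]
    public string HlWord { get; set; }
}

public static class MixedContentDemo
{
    // Hypothetical helper (not part of the original answer):
    // deserializes a <doc> document from an XML string.
    public static Doc Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Doc));
        using (var reader = new StringReader(xml))
        {
            return (Doc)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // Inline copy of the question's XML, assumed here so the sketch is self-contained.
        const string xml =
            "<doc><headline>AZERTY on the English <hlword>QWERTY</hlword> layout.</headline></doc>";

        Doc doc = Parse(xml);
        Console.WriteLine("Content: " + doc.Headline.Content);
        Console.WriteLine("HlWord:  " + doc.Headline.HlWord);
    }
}
```

If the position of the highlighted word matters, this two-property shape is probably not enough on its own, and reading the element with a different API may be a better fit.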

Answer 1 (score: 0)

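The reason you get the error is the hlword tag inside the content of the headline element. If you wrap the content in a CDATA section, it is not parsed as markup but read as-is.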

<doc>
    <headline><![CDATA[AZERTY on the English <hlword>QWERTY</hlword> layout.]]></headline>
</doc>
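
A minimal sketch of how the CDATA version could be read with the question's original Doc class (a plain string Headline). The helper name ReadHeadline, the class name CDataDemo, and the inline XML string are assumptions for illustration. This only applies if you are able to change how the XML file is produced.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot(ElementName = "doc")]
public class Doc
{
    // Plain string property, as in the question's original class.
    [XmlElement(ElementName = "headline")]
    public string Headline { get; set; }
}

public static class CDataDemo
{
    // Hypothetical helper: deserializes the document and returns the headline text.
    public static string ReadHeadline(string xml)
    {
        var serializer = new XmlSerializer(typeof(Doc));
        using (var reader = new StringReader(xml))
        {
            var doc = (Doc)serializer.Deserialize(reader);
            return doc.Headline;
        }
    }

    public static void Main()
    {
        // Inline copy of the CDATA-wrapped XML, assumed here so the sketch is self-contained.
        const string xml =
            "<doc><headline><![CDATA[AZERTY on the English <hlword>QWERTY</hlword> layout.]]></headline></doc>";

        // The CDATA content is treated as character data, so the tags come back as literal text.
        Console.WriteLine(ReadHeadline(xml));
    }
}
```

With this approach the headline string still contains the literal <hlword> tags, so any highlighting markup would have to be handled separately by the caller.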