How to ignore the #cdata-section of HTML content during XML to JSON conversion in C#

Asked: 2017-11-02 11:54:39

Tags: c# json xml

I am trying to convert XML that contains HTML code to JSON, using Newtonsoft Json.NET in C#:

  <Content>
   <richtext><![CDATA[ <p> <strong>This is sample richtext content </strong> </p> ]]></richtext>
   <htmlcontent> <![CDATA[<p> <strong>This is html content </strong> </p> ]]></htmlcontent>
   <others> sample </others>
  </Content>

My C# code is:

string xmlContent = @"<Content><richtext><![CDATA[ <p> <strong>This is sample richtext content </strong> </p> ]]></richtext><htmlcontent> <![CDATA[<p> <strong>This is html content </strong> </p> ]]></htmlcontent><others> sample </others></Content>";
XmlDocument doc = new XmlDocument(); // requires System.Xml
doc.LoadXml(xmlContent);
string jsonText = JsonConvert.SerializeXmlNode(doc, Newtonsoft.Json.Formatting.Indented);
Console.WriteLine("JSON is :" + jsonText);

My output is:

{
  "Content": {
    "richtext": {
      "#cdata-section": " <p> <strong>This is sample richtext content </strong> </p> "
    },
    "htmlcontent": {
      "#cdata-section": "<p> <strong>This is html content </strong> </p> "
    },
    "others": " sample "
  }
}

My expected output is:

{
  "Content": {
    "richtext": "<p> <strong>This is sample richtext content </strong> </p>",
    "htmlcontent": "<p> <strong>This is html content </strong> </p>",
    "others": " sample "
  }
}

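Is there a way to remove the #cdata-section elements from the XML during JSON conversion?

1 Answer:

Answer 0: (score: 1)

Remove the CDATA nodes from the document and add their HTML back as plain text. The markup is then stored as ordinary text content and is escaped when the XML is written out.

Let's use LINQ to XML instead of XmlDocument. It is more convenient.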

string xmlContent = @"<Content><richtext><![CDATA[ <p> <strong>This is sample richtext content </strong> </p> ]]></richtext><htmlcontent> <![CDATA[<p> <strong>This is html content </strong> </p> ]]></htmlcontent><others> sample </others></Content>";
var doc = XElement.Parse(xmlContent);

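// Materialize the list first, because the loop below modifies the tree.
var cdata = doc.DescendantNodes().OfType<XCData>().ToList();
foreach (var cd in cdata)
{
    // Re-add the CDATA content as plain text, then drop the CDATA node.
    cd.Parent.Add(cd.Value);
    cd.Remove();
}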

Console.WriteLine(doc);

string jsonText = JsonConvert.SerializeXNode(doc, Newtonsoft.Json.Formatting.Indented);
Console.WriteLine(jsonText);
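
If you would rather keep the XmlDocument from the question, the same idea can probably be sketched as below. This is a minimal sketch and not part of the original answer: the class name `CdataFlattener` and the method `ToJson` are hypothetical, and the `Trim()` call is an assumption added so that the values match the expected output in the question (drop it to keep the surrounding whitespace).

```csharp
using System.Linq;
using System.Xml;
using Newtonsoft.Json;

// Hypothetical helper, named for illustration only.
public static class CdataFlattener
{
    public static string ToJson(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Collect every CDATA child of every element up front,
        // because the loop below modifies the tree.
        var cdataNodes = doc.SelectNodes("//*")
            .Cast<XmlNode>()
            .SelectMany(element => element.ChildNodes.Cast<XmlNode>())
            .OfType<XmlCDataSection>()
            .ToList();

        foreach (var cd in cdataNodes)
        {
            // Assumption: trimming is wanted, as in the expected output.
            XmlText text = doc.CreateTextNode(cd.Value.Trim());
            cd.ParentNode.ReplaceChild(text, cd);
        }

        // Fully qualified because System.Xml also defines a Formatting type.
        return JsonConvert.SerializeXmlNode(doc, Newtonsoft.Json.Formatting.Indented);
    }
}
```

Called with the `xmlContent` string from the question, `CdataFlattener.ToJson(xmlContent)` should return plain string values for `richtext` and `htmlcontent`, with no `#cdata-section` key.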