What is the best way in C# to fix/replace XHTML entities in scraped source code?

Asked: 2011-05-31 17:24:27

Tags: c# xhtml

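Basically, I'm scraping source code that contains entities such as &amp; and &gt;, and I want to replace them with & and > respectively. A plain string replace would be exhausting, because hundreds of possible entities can show up in the source. Is there a built-in or standard way to do this, so I don't have to type a hundred lines of string replaces?

Thanks! :)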

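2 answers:

Answer 0: (score: 2)

The WebUtility.HtmlDecode method (in the System.Net namespace) may be what you're looking for.

This article gives an example of calling it from PowerShell: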

HTMLDecode Sample - Using PowerShell
<#
.SYNOPSIS
    This script encodes and decodes an HTML String
.DESCRIPTION
    This script uses the HtmlEncode and HtmlDecode methods of
    System.Net.WebUtility to round-trip a sample string.
.NOTES
    File Name  : Show-HtmlCoding.ps1
    Author     : Thomas Lee - tfl@psp.co.uk
    Requires   : PowerShell Version 2.0
.LINK
    This script posted to:
        http://www.pshscripts.blogspot.com
    MSDN sample posted to:
        http://msdn.microsoft.com/en-us/library/ee388364.aspx
.EXAMPLE
    PSH [C:\foo]: .\Show-HtmlCoding.ps1
    Original String: <this is a string123> & so is this one??
    Encoded String : &lt;this is a string123&gt; &amp; so is this one??
    Decoded String : <this is a string123> & so is this one??
    Original string = Decoded string?: True   
#>

# Create string to encode/decode
$Str = "<this is a string123> & so is this one??"

# Encode string (HTML-escape <, > and &)
$EncStr = [System.Net.WebUtility]::HtmlEncode($Str)

# Decode string (turn the entities back into characters)
$DecStr = [System.Net.WebUtility]::HtmlDecode($EncStr)

# Display strings and check that the round trip is lossless
"Original String: {0}" -f $Str
"Encoded String : {0}" -f $EncStr
"Decoded String : {0}" -f $DecStr
$Eq = ($Str -eq $DecStr)
"Original string = Decoded string?: {0}" -f $Eq
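
Since the question asks about C#, the same call can be made directly without PowerShell. The following is only a minimal sketch: the class name `EntityDecodeDemo`, the helper `DecodeEntities`, and the sample input string are made up for illustration, and it assumes .NET Framework 4 or later, where `System.Net.WebUtility` is available.

```csharp
using System;
using System.Net;

public static class EntityDecodeDemo
{
    // Hypothetical helper (not from the original answer): decodes every
    // HTML/XHTML entity in the scraped text with a single framework call.
    public static string DecodeEntities(string scraped)
    {
        // HtmlDecode handles named entities (&amp;, &lt;, ...) as well as
        // decimal (&#62;) and hexadecimal (&#x3E;) character references.
        return WebUtility.HtmlDecode(scraped);
    }

    public static void Main()
    {
        // Assumed sample input: a line of scraped, entity-escaped source code.
        string scraped = "if (a &lt; b &amp;&amp; b &gt; c) { }";

        // Prints: if (a < b && b > c) { }
        Console.WriteLine(DecodeEntities(scraped));
    }
}
```

One call replaces the whole table of string replacements, so no per-entity code is needed. On older framework versions, `System.Web.HttpUtility.HtmlDecode` is the usual alternative, at the cost of a reference to System.Web.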

Answer 1: (score: 0)

I would use the sed command-line tool and write a small script for this. Try googling "sed one-liners". I bet you'll like it.