How to remove specific tags from HTML

Time: 2012-12-19 15:08:08

Tags: c# html

  

Possible duplicate:
  How to use HTML Agility pack

My HTML looks like this:

<div><span class="help">This is text.</span>Hello, this is text.</div>
<div>I have a question.<span class="help">Hi</span></div>

Now I want to use C# to remove the <span class="help"></span> elements together with the text between them, so that only this is left:

<div>Hello, this is text.</div>
<div>I have a question.</div>

Does anyone have any ideas?

4 answers:

Answer 0: (score: 3)

You should use Html Agility Pack to work with HTML.

using HtmlAgilityPack;

string text = @"<div><span class=""help"">This is text.</span>Hello, this is text.</div>
<div>I have a question.<span class=""help"">Hi</span></div>";
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(text);
// SelectNodes returns null (not an empty collection) when nothing matches.
var nodes = doc.DocumentNode.SelectNodes("//span[@class='help']");
if (nodes != null)
{
    foreach (HtmlNode node in nodes)
    {
        node.Remove(); // removes the span together with its contents
    }
}
string result = doc.DocumentNode.InnerHtml;

Answer 1: (score: 2)

I would use Html Agility Pack to parse the HTML.

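// Requires: using System.Linq;
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);  // html is your input string
// Keep each div's direct text nodes; LastChild would return the span in the second div.
var divs = doc.DocumentNode.Elements("div")
      .Select(div => string.Format("<div>{0}</div>", string.Concat(
          div.ChildNodes.Where(n => n.NodeType == HtmlAgilityPack.HtmlNodeType.Text)
                        .Select(n => n.InnerText))));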

Answer 2: (score: 0)

You can use a regular expression.

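    // Requires: using System.Diagnostics; using System.Text.RegularExpressions;
    string val = @"<div><span class=""help"">This is text.</span>Hello, this is text.</div><div>I have a question.<span class=""help"">Hi</span></div>";
    // Lazily matches every <span ...>...</span> that has attributes, not only class="help".
    Regex reg = new Regex("<span .+?</span>", RegexOptions.IgnoreCase | RegexOptions.Singleline);
    string ret = reg.Replace(val, "");
    Debug.WriteLine(ret);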
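
If other spans in the markup must survive, the pattern can be narrowed to spans whose class attribute is "help". The following is only a sketch under assumptions: HelpSpanStripper and Strip are hypothetical names, the class value is assumed to be exactly "help" in double quotes, and the span is assumed not to contain a nested span. For anything less regular than that, a real parser such as Html Agility Pack is the safer choice.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class HelpSpanStripper
{
    // Matches an opening <span> tag that carries class="help", then everything
    // up to the first </span>. Assumes no nested spans inside the match.
    private static readonly Regex HelpSpan = new Regex(
        @"<span\b[^>]*\bclass\s*=\s*""help""[^>]*>.*?</span>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string Strip(string html)
    {
        return HelpSpan.Replace(html, "");
    }

    public static void Main()
    {
        string val = @"<div><span class=""help"">This is text.</span>Hello, this is text.</div><div>I have a question.<span class=""help"">Hi</span></div>";
        Console.WriteLine(Strip(val));
    }
}
```

A span with any other class, or with no class at all, is left untouched, because [^>]* cannot cross the end of the opening tag.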

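Answer 3: (score: -2)

Give the elements runat="server" so that they can be accessed from the code-behind. Then, at the appropriate point, get each element by its id and set element.innerHTML = ""; or element.innerText = "";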