Cleaning up a font-size declaration with Regex

Date: 2012-09-12 13:52:59

Tags: c# .net regex

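I'm trying to use a regular expression to remove

font-size:85%;

from

<font style="font-size:85%;font-family:arial,sans-serif">

My regex is ^font-size:(*);

What I mean is that I need to remove the font-size declaration completely.

Can anyone help me?

Thanks!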

2 answers:

Answer 0 (score: 3)

Here is the regex you need:

string html = @"<font style=""font-size:85%;font-family:arial,sans-serif"">";
string pattern = @"font-size\s*?:.*?(;|(?=""|'|;))";
string cleanedHtml = Regex.Replace(html, pattern, string.Empty);

This regex still works when font-size is given in pt or em, or when a different set of CSS styles is specified (i.e. font-family is not specified). You can see the result here.
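To make that claim concrete, here is a minimal sketch that runs the same pattern over a few inputs. The class name `FontSizeCleaner`, the `Clean` helper, and the sample strings other than the first are made up for illustration; only the pattern comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the pattern from the answer above.
public static class FontSizeCleaner
{
    private const string Pattern = @"font-size\s*?:.*?(;|(?=""|'|;))";

    // Removes every font-size declaration found in the input.
    public static string Clean(string html)
    {
        return Regex.Replace(html, Pattern, string.Empty);
    }

    public static void Main()
    {
        string[] samples =
        {
            // Percent value followed by another declaration.
            @"<font style=""font-size:85%;font-family:arial,sans-serif"">",
            // pt value, last declaration, no trailing semicolon.
            @"<font style=""font-size:12pt"">",
            // em value inside a single-quoted attribute.
            @"<span style='color:red;font-size:1.2em'>"
        };

        foreach (string sample in samples)
        {
            Console.WriteLine(Clean(sample));
        }
        // Prints:
        // <font style="font-family:arial,sans-serif">
        // <font style="">
        // <span style='color:red;'>
    }
}
```

Note that when font-size is the only declaration, an empty style attribute is left behind; removing that would need a separate step.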

The regex is explained below:

// font-size\s*?:.*?(;|(?="|'|;))
// 
// Match the characters “font-size” literally «font-size»
// Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the character “:” literally «:»
// Match any single character that is not a line break character «.*?»
//    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match the regular expression below and capture its match into backreference number 1 «(;|(?="|'|;))»
//    Match either the regular expression below (attempting the next alternative only if this one fails) «;»
//       Match the character “;” literally «;»
//    Or match regular expression number 2 below (the entire group fails if this one fails to match) «(?="|'|;)»
//       Assert that the regex below can be matched, starting at this position (positive lookahead) «(?="|'|;)»
//          Match either the regular expression below (attempting the next alternative only if this one fails) «"»
//             Match the character “"” literally «"»
//          Or match regular expression number 2 below (attempting the next alternative only if this one fails) «'»
//             Match the character “'” literally «'»
//          Or match regular expression number 3 below (the entire group fails if this one fails to match) «;»
//             Match the character “;” literally «;»
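The difference between the two branches of group 1 can be seen by inspecting the match instead of replacing it. This is only a sketch: `FontSizeMatchDemo` and `Describe` are hypothetical names, and the second sample string is an assumption added for contrast.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper that shows what the pattern actually consumes.
public static class FontSizeMatchDemo
{
    private const string Pattern = @"font-size\s*?:.*?(;|(?=""|'|;))";

    // Returns the full match and the contents of capture group 1.
    public static string Describe(string html)
    {
        Match m = Regex.Match(html, Pattern);
        if (!m.Success)
        {
            return "no match";
        }
        return "match=[" + m.Value + "] group1=[" + m.Groups[1].Value + "]";
    }

    public static void Main()
    {
        // First branch: the trailing ';' is consumed and captured.
        Console.WriteLine(Describe(@"<font style=""font-size:85%;font-family:arial"">"));
        // prints: match=[font-size:85%;] group1=[;]

        // Second branch: the lookahead sees the closing quote but does not consume it.
        Console.WriteLine(Describe(@"<font style=""font-size:12pt"">"));
        // prints: match=[font-size:12pt] group1=[]
    }
}
```

Because the lookahead is zero-width, the closing quote of the attribute survives the replacement, which keeps the markup well-formed.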

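Answer 1 (score: 3)

A few things in your current regex cause it to fail:

^font-size:(*);

You are anchoring to the start of the line with ^, but the attribute is not at the start of the line.

The * on its own doesn't mean anything, because it has nothing before it to repeat.

Change it to: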

font-size: ?\d{1,2}%;
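A small sketch comparing the two patterns, under the assumption that they are used with .NET's `Regex` class. `PatternComparison`, `Compiles`, and `Strip` are hypothetical names, and the 100% and 12pt inputs are added for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper comparing the original and the suggested pattern.
public static class PatternComparison
{
    public const string Original = @"^font-size:(*);";
    public const string Suggested = @"font-size: ?\d{1,2}%;";

    // Returns false when .NET rejects the pattern at construction time.
    public static bool Compiles(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // .NET reports a quantifier with nothing to repeat as a parse error.
            return false;
        }
    }

    // Applies the suggested pattern and removes whatever it matches.
    public static string Strip(string html)
    {
        return Regex.Replace(html, Suggested, string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(Compiles(Original));   // expected: False
        Console.WriteLine(Compiles(Suggested));  // expected: True

        // One- or two-digit percentages are removed.
        Console.WriteLine(Strip(@"<font style=""font-size:85%;font-family:arial,sans-serif"">"));
        // Three-digit percentages and other units are left untouched.
        Console.WriteLine(Strip(@"<font style=""font-size:100%;"">"));
        Console.WriteLine(Strip(@"<font style=""font-size:12pt;"">"));
    }
}
```

The suggested pattern is deliberately narrow: it only handles percentages of one or two digits that end in a semicolon, so other units or a missing semicolon need a broader pattern such as the one in the other answer.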