"outline-style: none; margin: 0px; padding: 2px; background-color: #eff0f8; color: #3b3a39; font-family: Georgia,'Times New Roman',Times,serif; font-size: 14px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 18px; orphans: 2; text-align: center; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; border: 1px solid #ebebeb; float: left;"
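I have this as inline CSS. Using a regular expression, I want to replace every property that starts with "background" or "font" with a blank. The last property in the inline CSS may not end with a semicolon.
I am using the code below as a Django filter to strip these properties on the server side with Beautiful Soup: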
def html_remove_attrs(value):
    soup = BeautifulSoup(value)
    print "hi"
    # match every tag that carries a style attribute
    for tag in soup.findAll(True, {'style': re.compile(r'')}):
        #tag.attrs = None
        #for attr in tag.attrs:
        #    if "class" in attr:
        #        tag.attrs.remove(attr)
        #    if "style" in attr:
        #        tag.attrs.remove(attr)
        for attr in tag.attrs:
            if "style" in attr:
                # remove the background and font properties
                pass  # placeholder: this is the part I don't know how to write
    return soup
Answer 0 (score: 2)
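I don't know the details of your programming environment, but you asked for a regular expression. This regex captures the property key (plus the colon and any whitespace after it) as group 1 ($1) and the property value as group 2 ($2):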
((?:background|font)(?:[^:]+):(?:\s*))([^;]+)
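The expression does not remove the property values; it only finds them. How you remove them depends on your programming environment (language/library).
Basically, though, you do a global find/replace and replace each whole match with $1.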
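Since the question uses Python, here is a minimal sketch of that find/replace with Python's `re` module. The function name `blank_values` and the sample string are made up for illustration; the pattern is the one above written as a raw string, and `\1` is Python's spelling of $1.

```python
import re

# Same pattern as above, written as a Python raw string.
PATTERN = re.compile(r"((?:background|font)(?:[^:]+):(?:\s*))([^;]+)")


def blank_values(style):
    """Replace each whole match with group 1, which drops the value (group 2)."""
    return PATTERN.sub(r"\1", style)


# Hypothetical sample; the last property has no trailing semicolon.
sample = "margin: 0px; background-color: #eff0f8; font-size: 14px; float: left"
print(blank_values(sample))
# prints: margin: 0px; background-color: ; font-size: ; float: left
```

Note that this keeps the property names and only empties their values, which is exactly what replacing with $1 does.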
For example, in Java you could do this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) throws Exception {
        String[] lines = {
            "outline-style: none; margin: 0px; padding: 2px; background-color: #eff0f8; color: #3b3a39; font-family: Georgia,'Times New Roman',Times,serif; font-size: 14px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 18px; orphans: 2; text-align: center; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; border: 1px solid #ebebeb; float: left;",
            "outline-style: none; margin: 0px; padding: 2px; background-color: #eff0f8; color: #3b3a39; font-family: Georgia,'Times New Roman',Times,serif; font-size: 14px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 18px; orphans: 2; text-align: center; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; border: 1px solid #ebebeb; float: left",
            "background-color: #eff0f8;",
            "background-color: #eff0f8",
        };
        // In a Java string literal the backslash of \s must itself be escaped.
        String regex = "((?:background|font)(?:[^:]+):(?:\\s*))([^;]+)";
        Pattern p = Pattern.compile(regex);
        for (String s : lines) {
            StringBuffer sb = new StringBuffer();
            Matcher m = p.matcher(s);
            while (m.find()) {
                // Group 2 is captured for debugging only: we take its
                // length and fill that span with '-' so the before and
                // after lines are easy to compare.
                String text = m.group(2);
                text = text.replaceAll(".", "-");
                m.appendReplacement(sb, "$1" + text);
                // For non-debug mode, just use this instead:
                // m.appendReplacement(sb, "$1");
            }
            m.appendTail(sb);
            System.err.println("> " + s);             // before
            System.err.println("< " + sb.toString()); // after
            System.err.println();
        }
    }
}
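If the goal is to drop the whole declaration (name and value) rather than blank the value, a different technique may be simpler than the regex above: split the style string on semicolons and filter by property name. The sketch below is only that, a sketch. The names `UNWANTED` and `strip_declarations` are hypothetical, and it assumes no semicolons occur inside values (for example inside a quoted string or a `url(...)`). Unlike the pattern above, it also matches the bare shorthands `font:` and `background:`.

```python
import re

# Property names to drop: "background", "font", and their hyphenated
# longhands such as "background-color" or "font-size".
UNWANTED = re.compile(r"(?:background|font)(?:-|$)", re.IGNORECASE)


def strip_declarations(style):
    """Return the style string without background*/font* declarations."""
    kept = []
    for declaration in style.split(";"):
        name, sep, _value = declaration.partition(":")
        if not sep:
            # Empty or malformed fragment, e.g. after a trailing semicolon.
            continue
        if UNWANTED.match(name.strip()):
            continue
        kept.append(declaration.strip())
    return "; ".join(kept)


print(strip_declarations(
    "margin: 0px; background-color: #eff0f8; color: #3b3a39; "
    "font-family: Georgia,'Times New Roman',Times,serif; float: left"
))
# prints: margin: 0px; color: #3b3a39; float: left
```

Because the string is split on semicolons, it makes no difference whether the last property ends with one. Inside the Beautiful Soup loop from the question, the result could then be written back with something like `tag['style'] = strip_declarations(tag['style'])`.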