Am I missing something? Is there a better way?
INPUT:
<span style="FONT-FAMILY: 'Lucida Sans','sans-serif'; COLOR: #003572; FONT-SIZE: 9pt;
mso-fareast-font-family: Calibri; mso-ansi-language: EN-US; mso-fareast-language: EN-US;
mso-bidi-language: AR-SA; mso-fareast-theme-font: minor-latin">Dr. Who is
<u>usually</u> available for consultations Mon - Thurs afternoons and Friday 9a-
12p at 555-1212. </span>
Desired output:
<span style="COLOR: #003572; FONT-SIZE: 9pt;">Dr. Who is <u>usually</u> available for consultations Mon - Thurs afternoons and Friday 9a-12p at 555-1212. </span>
My code so far:
// Clean the HTML in the weeklong note before writing it to the database
Whitelist wl = new Whitelist();
wl = Whitelist.simpleText();
wl.addTags("br");
wl.addTags("p");
wl.addTags("span");
wl.addAttributes(":all","style");
Document doc = Jsoup.parse( "<html><head></head><body>"+ds.getWeeklongNote()+"</body></html>");
Elements e = doc.select("*");
for (Element el : e){
    for (Attribute attr : el.attributes()){
        if (attr.getKey().equals("span")){
            String newValue = "";
            String s = attr.getValue();
            String[] values = s.split(";");
            for (String value : values){
                if (value.startsWith("COLOR")||value.startsWith("FONT-SIZE")){
                    newValue += attr.getKey()+"="+attr.getValue()+";";
                }
            }
            attr.setValue(newValue);
        }
    }
}
doc.html(e.outerHtml());
ds.setWeekLongNote(Jsoup.clean(doc.body().outerHtml(), wl));
Answer 0 (score: 1)
Try this:
Document doc = Jsoup.parse(html);
Elements e = doc.getElementsByTag("body");
Log.i("Span element: "+e.get(0).nodeName(), ""+e.get(0).nodeName());
// Narrow the selection to the span elements inside the body
e = e.get(0).getElementsByTag("span");
Attributes styleAtt = e.get(0).attributes();
// Assumes style is the first attribute of the first span
Attribute a = styleAtt.asList().get(0);
if(a.getKey().equals("style")){
    String[] items = a.getValue().trim().split(";");
    String newValue = "";
    for(String item: items){
        // Keep only the COLOR and FONT-SIZE declarations
        if(item.contains("COLOR:")||item.contains("FONT-SIZE:")){
            Log.i("Style Item: ", ""+item);
            newValue = newValue.concat(item).concat(";");
        }
    }
    a.setValue(newValue);
    Log.i("New Attribute: ",""+newValue);
}
Log.i("FINAL HTML: ",""+e.outerHtml());
doc.html(e.outerHtml());
Output:
08-17 18:28:07.692: I/FINAL HTML:(8148): <span style=" COLOR: #003572; FONT-SIZE: 9pt;">Dr. Who is <u>usually</u> available for consultations Mon - Thurs afternoons and Friday 9a- 12p at 555-1212. </span>
Cheers,
Answer 1 (score: 0)
If you have multiple span elements, you can use this snippet:
Document document = Jsoup.parse(html);
// Style properties that are allowed to remain (compared in lower case)
Vector<String> allowedItems = new Vector<String>();
allowedItems.add("color");
allowedItems.add("font-size");
Elements e = document.getElementsByTag("span");
for (Element element : e) {
    String[] styles = element.attr("style").split(";");
    Vector<String> filteredItems = new Vector<String>();
    for (String item : styles) {
        // The property name is the part before the first colon
        String key = (item.split(":"))[0].trim().toLowerCase();
        if ( allowedItems.contains(key) ){
            filteredItems.add(item);
        }
    }
    if( filteredItems.size() == 0 ){
        // Nothing allowed is left, so drop the attribute entirely
        element.removeAttr("style");
    }else{
        // StringUtils here is the Apache Commons Lang class
        element.attr("style",StringUtils.join(filteredItems, ";"));
    }
}
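The filtering step in the snippet above can also be pulled out into a small plain-Java helper so it can be tested without jsoup. This is only a sketch: `StyleFilter` and `filterStyle` are hypothetical names, not part of jsoup, and it assumes that no style value contains a semicolon of its own (for example inside a quoted string or a `url(...)`).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not a jsoup class): keeps only allow-listed CSS declarations.
final class StyleFilter {

    private StyleFilter() {}

    // Returns the declarations of `style` whose property name is in
    // `allowedProperties` (case-insensitive), joined with "; ".
    // Returns an empty string when nothing is allowed.
    static String filterStyle(String style, String... allowedProperties) {
        Set<String> allowed = new HashSet<>();
        for (String p : allowedProperties) {
            allowed.add(p.trim().toLowerCase(Locale.ROOT));
        }
        List<String> kept = new ArrayList<>();
        // Assumption: ';' only ever separates declarations.
        for (String declaration : style.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon < 0) {
                continue; // empty fragment or malformed declaration
            }
            String property = declaration.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            if (allowed.contains(property)) {
                kept.add(declaration.trim());
            }
        }
        return String.join("; ", kept);
    }
}
```

Inside the loop over the span elements, the result could then be written back with `element.attr("style", filtered)`, or the attribute dropped with `element.removeAttr("style")` when the result is empty.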
Answer 2 (score: -2)
//remove style attribute
Elements elms = doc.select("*").not("img");
for (Element e : elms) {
    // attr() returns "" rather than null for a missing attribute, so test for presence directly
    if (e.hasAttr("style")) {
        e.removeAttr("style");
    }
}