Given:
An XML structure like this:
    <span class="abbreviation">AGB<span class="explanation">Allgemeine Geschäftsbedingungen</span></span>
The result after the transformation should be:
    <abbr title="Allgemeine Geschäftsbedingungen">AGB</abbr>
I know that SAX is an event-based XML parser. With methods like #startElement(...) and #endElement(...) I can catch events (e.g. open-a-tag, close-a-tag), and with #characters(...) I can extract the text between tags.
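To make the event model concrete, here is a minimal sketch that records the SAX events for the sample markup using only the JDK parser. The class name SaxEventLog and the event label format are assumptions made for illustration; they are not part of any library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects SAX events as readable strings.
public class SaxEventLog extends DefaultHandler {
    private final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    // characters() may be called several times for one text run, so buffer it
    // and record it as a single event once the next tag event arrives.
    private void flushText() {
        if (text.length() > 0) {
            events.add("text:" + text);
            text.setLength(0);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        events.add("start:" + qName + "[class=" + atts.getValue("class") + "]");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    // Parses the given XML string and returns the recorded events in order.
    public static List<String> parse(String xml) {
        try {
            SaxEventLog log = new SaxEventLog();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), log);
            return log.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<span class=\"abbreviation\">AGB"
                + "<span class=\"explanation\">Allgemeine Gesch\u00e4ftsbedingungen</span></span>";
        parse(xml).forEach(System.out::println);
    }
}
```

The recorded order (outer start, text, inner start, text, two ends) is what any transformation built on these callbacks has to work with.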
My question is:
Is it possible to create the transformation mentioned above with SAX?
Answer 0 (score: 0)
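The answer is yes, it is possible!
You can get the main arguments/hints from this StackOverflow-link.
These are the things that have to be done: keep track of which span tag the parser is currently in, extract the text content (done in the #characters method), and write the abbr tag.
For completeness, here is the source code of the CoreMedia CAE filter: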
    import com.coremedia.blueprint.cae.richtext.filter.FilterFactory;
    import com.coremedia.xml.Filter;
    import org.apache.commons.lang3.StringUtils;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.AttributesImpl;

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class GlossaryFilter extends Filter implements FilterFactory {

        private static final String SPAN = "span";
        private static final String CLASS = "class";

        // Parser state: which of the two span tags has been opened.
        private boolean isAbbreviation = false;
        private boolean isExplanation = false;

        // Text of the outer span, kept until the abbr tag is written.
        private String abbreviation;
        private String currentUri;

        // Set once the abbr tag has been written; the two closing span tags
        // that follow are then swallowed in endElement.
        private boolean spanExplanationClose = false;
        private boolean spanAbbreviationClose = false;

        @Override
        public Filter getInstance(final HttpServletRequest request, final HttpServletResponse response) {
            return new GlossaryFilter();
        }

        @Override
        public void startElement(final String uri, final String localName, final String qName,
                                 final Attributes attributes) throws SAXException {
            if (isSpanAbbreviationTag(qName, attributes)) {
                // Remember the state only; the span itself is not passed on.
                isAbbreviation = true;
            } else if (isSpanExplanationTag(qName, attributes)) {
                isExplanation = true;
                currentUri = uri;
            } else {
                super.startElement(uri, localName, qName, attributes);
            }
        }

        private boolean isSpanExplanationTag(final String qName, final Attributes attributes) {
            // Constant-first equals: getValue returns null if there is no class attribute.
            //noinspection OverlyComplexBooleanExpression
            return StringUtils.isNotEmpty(qName) && qName.equalsIgnoreCase(SPAN) && (
                    attributes.getLength() > 0) && "explanation".equals(attributes.getValue(CLASS));
        }

        private boolean isSpanAbbreviationTag(final String qName, final Attributes attributes) {
            //noinspection OverlyComplexBooleanExpression
            return StringUtils.isNotEmpty(qName) && qName.equalsIgnoreCase(SPAN) && (
                    attributes.getLength() > 0) && "abbreviation".equals(attributes.getValue(CLASS));
        }

        @Override
        public void endElement(final String uri, final String localName, final String qName)
                throws SAXException {
            if (spanExplanationClose) {
                // Closing tag of the inner span: drop it.
                spanExplanationClose = false;
            } else if (spanAbbreviationClose) {
                // Closing tag of the outer span: drop it.
                spanAbbreviationClose = false;
            } else {
                super.endElement(uri, localName, qName);
            }
        }

        @Override
        public void characters(final char[] ch, final int start, final int length) throws SAXException {
            if (isAbbreviation && isExplanation) {
                // Inside the inner span: this text becomes the title of the abbr tag.
                final String explanation = new String(ch, start, length);
                final AttributesImpl newAttributes = createAttributes(explanation);
                writeAbbrTag(newAttributes);
                changeState();
            } else if (isAbbreviation && !isExplanation) {
                // Inside the outer span only: keep the text for later use.
                abbreviation = new String(ch, start, length);
            } else {
                super.characters(ch, start, length);
            }
        }

        private void changeState() {
            isExplanation = false;
            isAbbreviation = false;
            spanExplanationClose = true;
            spanAbbreviationClose = true;
        }

        @SuppressWarnings("TypeMayBeWeakened")
        private void writeAbbrTag(final AttributesImpl newAttributes) throws SAXException {
            super.startElement(currentUri, "abbr", "abbr", newAttributes);
            super.characters(abbreviation.toCharArray(), 0, abbreviation.length());
            super.endElement(currentUri, "abbr", "abbr");
        }

        private AttributesImpl createAttributes(final String explanation) {
            final AttributesImpl newAttributes = new AttributesImpl();
            newAttributes.addAttribute(currentUri, "title", "abbr:title", "CDATA", explanation);
            return newAttributes;
        }
    }
The interesting parts are in the methods startElement(...), endElement(...) and characters(...).
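In startElement(...), you store the state of which tag the SAX parser is in (more precisely: you record that a span tag with class="abbreviation" or class="explanation" has been opened).
isAbbreviation is set for an opened span tag with class="abbreviation".
isExplanation is set for an opened span tag with class="explanation".
You only store the state. The span tags mentioned are not passed on (as a result, they are removed from the output). Every other tag is passed on without filtering, i.e. unmodified (that is the else block).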
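As a small illustration of the tag check that selects the branch in startElement(...), here is a sketch using only the JDK's SAX classes. The helper name isSpanWithClass and the class SpanClassCheck are hypothetical and not part of the filter; the sketch assumes that a missing class attribute should simply count as "no match".

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

public class SpanClassCheck {

    // True only for a span element whose class attribute equals cssClass.
    // The constant-first equals avoids a NullPointerException when the
    // element has no class attribute (getValue then returns null).
    public static boolean isSpanWithClass(String qName, Attributes attributes, String cssClass) {
        return "span".equalsIgnoreCase(qName) && cssClass.equals(attributes.getValue("class"));
    }

    public static void main(String[] args) {
        AttributesImpl abbreviation = new AttributesImpl();
        abbreviation.addAttribute("", "class", "class", "CDATA", "abbreviation");

        AttributesImpl noClass = new AttributesImpl();
        noClass.addAttribute("", "id", "id", "CDATA", "x1");

        System.out.println(isSpanWithClass("span", abbreviation, "abbreviation"));
        System.out.println(isSpanWithClass("span", noClass, "abbreviation"));
        System.out.println(isSpanWithClass("p", abbreviation, "abbreviation"));
    }
}
```

Only the first call matches; a span without a class attribute and a non-span element are both rejected without throwing.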
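In endElement(...), you only want to pass on tags other than the span tags mentioned above. All those tags are passed on unmodified (the else block). If the SAX parser is at a closing span tag (with class="abbreviation" or class="explanation"), you do nothing (apart from updating the state).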
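In characters(...), the magic (creating a tag with the parser) happens. Depending on the state, one of three things is done:
State 1: isAbbreviation && isExplanation
State 2: isAbbreviation && !isExplanation
State 3: anything else (the else block)
State 3.
Simply copy the text you found.
State 2.
Extract the content of the span tag with class="abbreviation" for later use.
State 1.
Use the text as the title of the abbr tag (title=....) and write the abbr tag (instead of the two span tags).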
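The three states can also be tried out without CoreMedia. The following is a sketch under stated assumptions, not the filter from this answer: it uses the JDK's XMLFilterImpl instead of com.coremedia.xml.Filter, the class name AbbrFilterSketch and its transform helper are hypothetical, and it assumes the two spans contain only text, as in the question. Unlike the filter above, it buffers the text and writes the abbr tag when the outer span closes, which also copes with characters(...) being called more than once for a single text run.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical JDK-only variant of the state machine described above.
public class AbbrFilterSketch extends XMLFilterImpl {
    private final StringBuilder abbreviation = new StringBuilder();
    private final StringBuilder explanation = new StringBuilder();
    private boolean inAbbreviation;
    private boolean inExplanation;

    public AbbrFilterSketch(XMLReader parent) {
        super(parent);
    }

    private static boolean isSpanWithClass(String qName, Attributes atts, String cssClass) {
        return "span".equalsIgnoreCase(qName) && cssClass.equals(atts.getValue("class"));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (!inAbbreviation && isSpanWithClass(qName, atts, "abbreviation")) {
            inAbbreviation = true;                 // swallow the outer span
        } else if (inAbbreviation && isSpanWithClass(qName, atts, "explanation")) {
            inExplanation = true;                  // swallow the inner span
        } else {
            super.startElement(uri, localName, qName, atts);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (inExplanation) {
            explanation.append(ch, start, length);   // state 1: becomes the title
        } else if (inAbbreviation) {
            abbreviation.append(ch, start, length);  // state 2: becomes the abbr text
        } else {
            super.characters(ch, start, length);     // state 3: copy unchanged
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (inExplanation) {
            inExplanation = false;                 // inner span closed, emit nothing
        } else if (inAbbreviation) {
            inAbbreviation = false;                // outer span closed: write the abbr tag
            AttributesImpl atts = new AttributesImpl();
            atts.addAttribute("", "title", "title", "CDATA", explanation.toString());
            char[] text = abbreviation.toString().toCharArray();
            super.startElement(uri, "abbr", "abbr", atts);
            super.characters(text, 0, text.length);
            super.endElement(uri, "abbr", "abbr");
            abbreviation.setLength(0);
            explanation.setLength(0);
        } else {
            super.endElement(uri, localName, qName);
        }
    }

    // Runs the filter over an XML string and serializes the filtered events.
    public static String transform(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader filter = new AbbrFilterSketch(factory.newSAXParser().getXMLReader());
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new SAXSource(filter, new InputSource(new StringReader(xml))),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<p>See <span class=\"abbreviation\">AGB"
                + "<span class=\"explanation\">Terms</span></span> now.</p>"));
    }
}
```

Buffering until the closing tag is a design choice of this sketch; markup with further elements nested inside the two spans would need a depth counter in addition to the two flags.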