I am parsing XML files that represent research papers/articles and storing them in a MySQL database from Java. The files follow the XML schema below.
<article>
<article-meta></article-meta>
<body>
<p>
Extensible Markup Language (XML) is a markup language that defines a set of
rules for encoding documents in a format that is both human-readable and
machine-readable <ref id="1"/>. It is defined in the XML 1.0 Specification produced by the
W3C, and several other related specifications.
</p>
<p>
Many application programming interfaces (APIs) have been developed to aid
software developers with processing XML <ref id="2"/> data, and several schema
systems exist to aid in the definition of XML-based languages.
</p>
</body>
<back>
<ref-list>
<ref id="1">Details about this reference</ref>
<ref id="2">Details about this reference</ref>
</ref-list>
</back>
</article>
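I am parsing the file with a DOM parser. One requirement is that, for each ref id, I must extract 150 characters to the left and to the right of the position where it is cited inside the body tag. How can I do this? The expected output looks like this: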
refId   leftText                  rightText
1       150 chars on left side    150 chars on right side
Answer 0 (score: 0)
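Assuming you have already used DOM in your code to get the <ref> tag element with id = 1 and its content value "Details about this reference", store the content value of the <ref> tag in a String variable. You can then use the substring method to get the characters on the left and the characters on the right. That's all.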
String text = "Details about this reference";
String leftText = text.substring(0, 7); // first 7 characters; use 150 instead of 7 for the real requirement
String rightText = text.substring(text.length() - 2); // last 2 characters; use 150 instead of 2 for the real requirement
Result
leftText: Details rightText: ce
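As a minimal sketch, the two substring calls can be wrapped in helpers that clamp the requested length to the length of the string, so a string shorter than the window is returned whole instead of causing an exception. The class and method names (ContextSubstring, leftChars, rightChars) are hypothetical and only used for illustration.

```java
class ContextSubstring {
    // First n characters of text, or all of text when it is shorter than n.
    public static String leftChars(String text, int n) {
        return text.substring(0, Math.min(n, text.length()));
    }

    // Last n characters of text, or all of text when it is shorter than n.
    public static String rightChars(String text, int n) {
        return text.substring(Math.max(0, text.length() - n));
    }

    public static void main(String[] args) {
        String text = "Details about this reference";
        System.out.println(leftChars(text, 7));    // prints "Details"
        System.out.println(rightChars(text, 2));   // prints "ce"
        System.out.println(leftChars(text, 150));  // prints the whole string, no exception
    }
}
```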
Note: before extracting, check that the string is at least 150 characters long. If it is shorter, substring throws a StringIndexOutOfBoundsException.
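The question asks for the text around the place where each ref is cited in the body, rather than the text of the reference entry itself. One possible approach, sketched here under stated assumptions, is to walk the body with DOM, concatenate its text nodes, record the offset at which each <ref> element appears, and then cut a clamped window on either side of that offset. This sketch assumes the input is well-formed XML in which citations are written as <ref id="1"/> inside <body>. The names RefContextExtractor, RefContext, extract and collect are hypothetical. Whitespace is kept exactly as it appears in the file, and a window may cross paragraph boundaries.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class RefContextExtractor {
    // Hypothetical holder for one output row: refId, leftText, rightText.
    public static class RefContext {
        public final String refId;
        public final String leftText;
        public final String rightText;

        RefContext(String refId, String leftText, String rightText) {
            this.refId = refId;
            this.leftText = leftText;
            this.rightText = rightText;
        }
    }

    // Returns one row per <ref> found inside <body>, with up to `window`
    // characters of body text on each side of the citation.
    public static List<RefContext> extract(String xml, int window) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<RefContext> rows = new ArrayList<>();
        Node body = doc.getElementsByTagName("body").item(0);
        if (body == null) {
            return rows; // no <body>, so nothing to extract
        }
        StringBuilder text = new StringBuilder();
        List<String> ids = new ArrayList<>();
        List<Integer> offsets = new ArrayList<>();
        collect(body, text, ids, offsets);
        for (int i = 0; i < ids.size(); i++) {
            int pos = offsets.get(i);
            // Clamp both ends so short documents do not cause an exception.
            String left = text.substring(Math.max(0, pos - window), pos);
            String right = text.substring(pos, Math.min(text.length(), pos + window));
            rows.add(new RefContext(ids.get(i), left, right));
        }
        return rows;
    }

    // Depth-first walk in document order: appends text nodes to `text` and
    // records the current text length as the offset of every <ref> element.
    static void collect(Node node, StringBuilder text, List<String> ids, List<Integer> offsets) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                if ("ref".equals(child.getNodeName())) {
                    ids.add(((Element) child).getAttribute("id"));
                    offsets.add(text.length());
                } else {
                    collect(child, text, ids, offsets);
                }
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<article><body><p>abcdef<ref id=\"1\"/>ghijkl</p></body>"
                + "<back><ref-list><ref id=\"1\">Details</ref></ref-list></back></article>";
        for (RefContext row : extract(xml, 3)) {
            System.out.println(row.refId + " | " + row.leftText + " | " + row.rightText);
        }
    }
}
```

Calling extract(xml, 150) would give the 150-character windows from the question. Each returned row can then be written to the MySQL table with the refId, leftText and rightText columns.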