Question

I wrote some Java code to parse an XML file for my own purposes, and I have run into a problem. My dataset is AIMed, and it looks like this:

<passage>
    <text>
        Isolation of human delta-catenin and its binding specificity with presenilin 1.
        We screened proteins for interaction with presenilin (PS) 1, and cloned the full-length cDNA of human delta-catenin, which encoded 1225 amino acids.
        Yeast two-hybrid assay, GST binding assay and immunoprecipitation demonstrated that delta-catenin interacted with a hydrophilic loop region in the endoproteolytic C-terminal fragment of PS1, but not with that of PS-2.
        These results suggest that PS1 and PS2 partly differ in function.
        PS1 loop fragment containing the pathogenic mutation retained the binding ability.
        We also found another armadillo-protein, p0071, interacted with PS1.
    </text>
    <annotation id="T1">
        <infon key="file">ann</infon>
        <infon key="type">protein</infon>
        <location offset="19" length="13"></location>
        <text>delta-catenin</text>
    </annotation>
    <relation id="R3">
        <infon key="relation type">Interaction</infon>
        <infon key="file">ann</infon>
        <infon key="type">Relation</infon>
        <node refid="T5" role="Arg1"></node>
        <node refid="T6" role="Arg2"></node>
    </relation>
</passage>

I am using SAXParser, and my code looks like this (for the text tag):

else if (bText) 
{
     System.out.println("Text: " 
     + new String(ch, start, length));
     bText = false;
}

But it only prints two cases. How can I fix this?

Answer 1

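Iterate over the nodes in the NodeList until you find the one you want, cast it to Element (the text element, in this case), and then call element.getTextContent(). See the documentation for Interface Node, and keep in mind that getTextContent() also returns the text of the node's descendants, if there are any.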
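
A minimal sketch of that approach, using the DOM parser that ships with the JDK. The class name PassageTextExtractor and the method passageText are made up for illustration, and the code assumes that <passage> is the root element of the parsed input, as in the excerpt from the question; a full AIMed file may wrap passages in further elements, in which case the starting node would need to be located first.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question or answer.
public class PassageTextExtractor {

    // Returns the trimmed content of the <text> element that is a direct
    // child of the root element, or null if there is no such child.
    // Assumption: the root element of the input is <passage>.
    public static String passageText(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Only direct children are inspected, so the <text> elements nested
        // inside <annotation> are not mistaken for the passage text.
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE
                    && "text".equals(node.getNodeName())) {
                Element text = (Element) node;
                return text.getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Shortened sample in the same shape as the data in the question.
        String xml =
                "<passage>"
              + "<text>We also found another armadillo-protein, p0071, interacted with PS1.</text>"
              + "<annotation id=\"T1\"><text>p0071</text></annotation>"
              + "</passage>";
        System.out.println(passageText(xml));
    }
}
```

Walking the direct children, rather than calling getElementsByTagName("text") on the whole document, is a deliberate choice here: the annotation elements contain their own <text> children, and getElementsByTagName would return those as well. Calling getTextContent() on the <passage> element itself would likewise concatenate the passage text with the text of every annotation.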
