Parsing a biology XML dataset with Java

Time: 2015-12-08 09:15:10

Tags: java xml

I wrote some Java code to parse an XML file that I intend to use, and I have run into a problem. My dataset is AIMed, and it looks like this:

<passage>
    <text>
        Isolation of human delta-catenin and its binding specificity with presenilin 1.
        We screened proteins for interaction with presenilin (PS) 1, and cloned the full-length cDNA of human delta-catenin, which encoded 1225 amino acids.
        Yeast two-hybrid assay, GST binding assay and immunoprecipitation demonstrated that delta-catenin interacted with a hydrophilic loop region in the endoproteolytic C-terminal fragment of PS1, but not with that of PS-2.
        These results suggest that PS1 and PS2 partly differ in function.
        PS1 loop fragment containing the pathogenic mutation retained the binding ability.
        We also found another armadillo-protein, p0071, interacted with PS1.
    </text>
    <annotation id="T1">
        <infon key="file">ann</infon>
        <infon key="type">protein</infon>
        <location offset="19" length="13"></location>
        <text>delta-catenin</text>
    </annotation>
    <relation id="R3">
        <infon key="relation type">Interaction</infon>
        <infon key="file">ann</infon>
        <infon key="type">Relation</infon>
        <node refid="T5" role="Arg1"></node>
        <node refid="T6" role="Arg2"></node>
    </relation>
</passage>

I am using SAXParser, and my code looks like this (for the text tag):

else if (bText) 
{
     System.out.println("Text: " 
     + new String(ch, start, length));
     bText = false;
}

But it only prints two cases. How can I fix this?

1 answer:

Answer 0: (score: 0)

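Iterate over the nodes in the NodeList until you find the one you need, cast it to Element (here, the text element), and then call element.getTextContent(). See the documentation for the Node interface, and keep in mind that getTextContent() also returns the text of the node's descendants, if there are any.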
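
The approach described above could be sketched roughly as follows. This is only a minimal illustration using the standard DOM API (DocumentBuilder, NodeList, Element.getTextContent()) instead of SAX; the class name PassageTextReader, the method readTexts, and the shortened sample XML in main are made up for this example and are not from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, named only for this sketch.
public class PassageTextReader {

    // Collects the trimmed content of every <text> element, in document order.
    // In the AIMed sample this covers both the passage text and the
    // <text> children of each <annotation>.
    public static List<String> readTexts(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        NodeList nodes = doc.getElementsByTagName("text");
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                Element element = (Element) node;
                // getTextContent() also includes text from descendant nodes.
                texts.add(element.getTextContent().trim());
            }
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Shortened stand-in for the passage shown in the question.
        String xml = "<passage>"
                + "<text>Isolation of human delta-catenin and its binding "
                + "specificity with presenilin 1.</text>"
                + "<annotation id=\"T1\"><text>delta-catenin</text></annotation>"
                + "</passage>";
        for (String text : readTexts(xml)) {
            System.out.println("Text: " + text);
        }
    }
}
```

If you would rather stay with SAX, note that characters() may be called more than once for a single element, so the usual pattern is to append each chunk to a StringBuilder and read the result in endElement() rather than printing inside characters().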