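XML parsing - searching for specific elements

Time: 2016-06-01 14:07:39

Tags: java xml parsing search

I have an XML document that I need to parse in order to extract specific values from it. The structure is similar to this: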

    <sequence tag="771b,1030" vr="SQ" card="2" len="988" name="axial_length_values_sequence">
        <item card="6" len="486">
            <element tag="771b,0000" vr="UL" vm="1" len="4" name="PrivateGroupLength">474</element>
            <element tag="771b,0010" vr="LO" vm="1" len="6" name="PrivateCreator">99CZM</element>
            <element tag="771b,1008" vr="CS" vm="1" len="2" name="laterality">R</element>
            <element tag="771b,1043" vr="FD" vm="1" len="8" name="mean_value_al">27.649999999999999</element>
            <element tag="771b,1044" vr="FD" vm="1" len="8" name="mean_value_snr">272.5</element>
        </item>
        <item card="6" len="486">
            <element tag="771b,0000" vr="UL" vm="1" len="4" name="PrivateGroupLength">474</element>
            <element tag="771b,0010" vr="LO" vm="1" len="6" name="PrivateCreator">99CZM</element>
            <element tag="771b,1008" vr="CS" vm="1" len="2" name="laterality">L</element>
            <element tag="771b,1043" vr="FD" vm="1" len="8" name="mean_value_al">27.0100000000000016</element>
            <element tag="771b,1044" vr="FD" vm="1" len="8" name="mean_value_snr">151.90000000000001</element>
        </item>
    </sequence>
    <sequence tag="771b,1032" vr="SQ" card="2" len="1268" name="keratometer_values_sequence">
        <item card="13" len="626">
            <element tag="771b,0000" vr="UL" vm="1" len="4" name="PrivateGroupLength">614</element>
            <element tag="771b,0010" vr="LO" vm="1" len="6" name="PrivateCreator">99CZM</element>
            <element tag="771b,1008" vr="CS" vm="1" len="2" name="laterality">R</element>
            <element tag="771b,1016" vr="FD" vm="1" len="8" name="refractive_index">1.3374999999999999</element>
            <element tag="771b,1017" vr="FD" vm="1" len="8" name="quali_tag">0</element>
            <element tag="771b,1049" vr="FD" vm="1" len="8" name="mean_value_r1">8.5199999999999996</element>
            <element tag="771b,104a" vr="FD" vm="1" len="8" name="mean_value_d1">39.609999999999999</element>
            <element tag="771b,104b" vr="FD" vm="1" len="8" name="mean_value_a1">174</element>
            <element tag="771b,104c" vr="FD" vm="1" len="8" name="mean_value_r2">8.4499999999999993</element>
            <element tag="771b,104d" vr="FD" vm="1" len="8" name="mean_value_d2">39.939999999999998</element>
            <element tag="771b,104e" vr="FD" vm="1" len="8" name="mean_value_a2">84</element>
            <element tag="771b,104f" vr="FD" vm="1" len="8" name="mean_value_zyl">0.33000000000000003</element>
        </item>
        <item card="13" len="626">
            <element tag="771b,0000" vr="UL" vm="1" len="4" name="PrivateGroupLength">614</element>
            <element tag="771b,0010" vr="LO" vm="1" len="6" name="PrivateCreator">99CZM</element>
            <element tag="771b,1008" vr="CS" vm="1" len="2" name="laterality">L</element>
            <element tag="771b,1016" vr="FD" vm="1" len="8" name="refractive_index">1.3374999999999999</element>
            <element tag="771b,1017" vr="FD" vm="1" len="8" name="quali_tag">0.01</element>
            <element tag="771b,1049" vr="FD" vm="1" len="8" name="mean_value_r1">8.4800000000000004</element>
            <element tag="771b,104a" vr="FD" vm="1" len="8" name="mean_value_d1">39.799999999999997</element>
            <element tag="771b,104b" vr="FD" vm="1" len="8" name="mean_value_a1">167</element>
            <element tag="771b,104c" vr="FD" vm="1" len="8" name="mean_value_r2">8.3399999999999999</element>
            <element tag="771b,104d" vr="FD" vm="1" len="8" name="mean_value_d2">40.469999999999999</element>
            <element tag="771b,104e" vr="FD" vm="1" len="8" name="mean_value_a2">77</element>
            <element tag="771b,104f" vr="FD" vm="1" len="8" name="mean_value_zyl">0.67000000000000002</element>
        </item>
    </sequence>

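There are 4 more "sequence" elements to parse.

For each "sequence" element, I need to extract the following values: [R

Depending on that value (R or L), I need to store specific values twice, once for the left eye ("L") and once for the right eye ("R"). For example, for tag="771b,1044" the right value would be "272.5" and the left value would be "151.90000000000001".

I'm losing my mind! Can anyone help me? If I search for a specific tag I can get a single value, but I can't figure out how to first search for "R", then look up only the values associated with "R", and then repeat the search for "L" and get its associated values. Note that "R" is not always the first element (it could also be "L"). Any help would be greatly appreciated. Thanks, everyone!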

1 Answer:

Answer 0: (score: 1)

Use JSOUP: https://jsoup.org/

I copied your XML into a file named test.xml and parsed it with JSOUP:

    import java.io.File;
    import java.math.BigDecimal;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    final Document doc = Jsoup.parse(new File(".\\test.xml"), "UTF-8");

    for (Element sequence : doc.select("sequence")) {
        String tag = sequence.attr("tag");

        // Start from zero for every sequence so values never leak from the previous one.
        BigDecimal left = BigDecimal.ZERO;
        BigDecimal right = BigDecimal.ZERO;

        // Each <item> holds one laterality element; its siblings carry that eye's values.
        for (Element laterality : sequence.select("element[name='laterality']")) {

            String value = "";

            if (tag.equals("771b,1030")) {
                value = laterality.siblingElements().select("element[name='mean_value_snr']").text();
            }
            // specify the correct element name for the other sequences here

            if (!value.isEmpty()) {
                // Assign by the laterality text, so the order of "R" and "L" does not matter.
                if (laterality.text().equals("L")) left = new BigDecimal(value);
                if (laterality.text().equals("R")) right = new BigDecimal(value);
            }
        }

        System.out.println(tag + ": " + "L mean_value=" + left + " | R mean_value=" + right);
    }

This prints:

    771b,1030: L mean_value=151.90000000000001 | R mean_value=272.5
    771b,1032: L mean_value=0 | R mean_value=0
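If adding a dependency is not an option, the same "find the laterality, then read its siblings" idea can be sketched with the XPath API that ships with the JDK. This is only a rough sketch, not the approach used above: the class name `LateralityValues`, the method `bySide`, and the `<root>` wrapper element are made up for illustration, and it assumes the real document has a single root element. The sequence tag and element name are concatenated into the XPath string, so they should come from trusted code, not from user input.

```java
import java.io.StringReader;
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LateralityValues {

    // Hypothetical helper: maps laterality ("R"/"L") to the value of the
    // named element, looking only inside the sequence with the given tag.
    static Map<String, BigDecimal> bySide(String xml, String sequenceTag, String elementName)
            throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // All <item> children of the requested sequence.
        String itemsExpr = "//sequence[@tag='" + sequenceTag + "']/item";
        NodeList items = (NodeList) xpath.evaluate(
                itemsExpr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);

        Map<String, BigDecimal> result = new LinkedHashMap<>();
        for (int i = 0; i < items.getLength(); i++) {
            Node item = items.item(i);
            // Both lookups are relative to the same <item>, so the value
            // always belongs to the laterality found next to it.
            String side = xpath.evaluate("element[@name='laterality']", item).trim();
            String value = xpath.evaluate("element[@name='" + elementName + "']", item).trim();
            if (!side.isEmpty() && !value.isEmpty()) {
                result.put(side, new BigDecimal(value));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in for the question's document; "L" deliberately comes first.
        String xml = "<root><sequence tag='771b,1030'>"
                + "<item><element name='laterality'>L</element>"
                + "<element name='mean_value_snr'>151.90000000000001</element></item>"
                + "<item><element name='laterality'>R</element>"
                + "<element name='mean_value_snr'>272.5</element></item>"
                + "</sequence></root>";
        System.out.println(bySide(xml, "771b,1030", "mean_value_snr"));
        // prints {L=151.90000000000001, R=272.5}
    }
}
```

Because the result is keyed by the laterality text, it does not matter whether the "R" item or the "L" item appears first. Calling `bySide` once per sequence tag and element name would cover the remaining sequences.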

UPDATE: replaced double with BigDecimal to avoid loss of precision.
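To illustrate why that matters, here is a small sketch (the class name `PrecisionDemo` and its two helper methods are made up for this example). It takes one value from the question's XML and compares the text that comes back after parsing it as a `BigDecimal` versus as a `double`. A `double` prints with at most 17 significant digits, so an 18-digit value cannot survive the round trip unchanged.

```java
import java.math.BigDecimal;

class PrecisionDemo {

    // BigDecimal keeps every digit of the decimal text it was built from.
    static String viaBigDecimal(String text) {
        return new BigDecimal(text).toString();
    }

    // A double stores the nearest binary value, so the text is not preserved.
    static String viaDouble(String text) {
        return Double.toString(Double.parseDouble(text));
    }

    public static void main(String[] args) {
        String text = "27.0100000000000016"; // mean_value_al of the "L" item
        System.out.println("BigDecimal: " + viaBigDecimal(text));
        System.out.println("double:     " + viaDouble(text));
    }
}
```

Whether the extra digits are meaningful is a separate question, since the XML marks these values as `vr="FD"`; `BigDecimal` simply reproduces the text exactly as it appears in the file.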