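Checking XML for duplicates

Time: 2015-05-22 07:27:24

Tags: java xml

I have been working with XML files for a few days, and I have tried to write a program that checks whether any English word appears more than once in my XML.

My XML file looks like this: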

<VOCABULAR>
    <cuvant>
        <cuvE>to go</cuvE>
        <exE>Where to go?
            Go home!
        </exE>
        <cuvR>a merge</cuvR>
        <exR>Unde mergi?
            Mergi acasa!
        </exR>
    </cuvant>
    <cuvant>
        <cuvE>to listen</cuvE>
        <exE>Listen to me!
            I like to listen classical music
        </exE>
        <cuvR>a asculta</cuvR>
        <exR>Asculta-ma!
            Imi place sa ascult muzica clasica
        </exR>
    </cuvant>
    <cuvant>
        <cuvE>to arrive</cuvE>
        <exE>When do you arrive at home ?</exE>
        <cuvR>a ajung</cuvR>
        <exR>Cand ajungi acasa ?</exR>
    </cuvant>
    <cuvant>
        <cuvE>to go</cuvE>
        <exE>Where to go?
            Go home!
        </exE>
        <cuvR>a merge</cuvR>
        <exR>Unde mergi?
            Mergi acasa!
        </exR>
    </cuvant>
</VOCABULAR>

And my Java code:

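import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class Cuvant {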

    public String cuvR;
    public String exR;
    public String cuvE;
    public String exE;

}

class ParsareVocabular {

    static XMLStreamReader reader;

    public static void main(String args[]) throws XMLStreamException{

        List<Cuvant> al = null;
        Cuvant cuvCrt = null;
        String continutTag = null;
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            reader = factory.createXMLStreamReader(new FileInputStream(new File("vocabular.xml")));
        } catch (IOException e) {
            e.printStackTrace();
        }

        while(reader.hasNext()){

            int event = reader.next();

            switch(event){

                case XMLStreamConstants.START_ELEMENT:
                    if("VOCABULAR".equals(reader.getLocalName()))
                        al=new ArrayList<Cuvant>();
                    else if("cuvant".equals(reader.getLocalName()))
                        cuvCrt = new Cuvant();
                    break;
                case XMLStreamConstants.CHARACTERS:
                    continutTag = reader.getText().trim();
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    switch (reader.getLocalName()) {
                        case "cuvant":
                            al.add(cuvCrt);
                            break;
                        case "cuvR":
                            cuvCrt.cuvR = continutTag;
                            break;
                        case "exR":
                            cuvCrt.exR = continutTag;
                            break;
                        case "cuvE":
                            cuvCrt.cuvE = continutTag;
                            break;
                        case "exE":
                            cuvCrt.exE = continutTag;
                            break;
                    }
                break;
            }
        }

        Iterator<Cuvant> it = al.iterator();
        int diferit = 0;

        while(it.hasNext()){
            Cuvant c = it.next();
            if(c.cuvE.equals(it.next()))
                diferit++;
            //System.out.println(c.cuvE);
        }

        /*for(int i=0;i<al.size();i++){
            Cuvant c = it.next();

        }*/
        if(diferit==0)
            System.out.println("no duplicates");
        else
            System.out.println("are duplicates");

        System.out.println("Total words in english: "+al.size());
    }
}

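I searched for a solution but found nothing that solves my problem. If you can give me a suggestion, please do. Thank you very much!

1 answer:

Answer 0: (score: 0)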

while (it.hasNext()) {
  Cuvant c = it.next();
  if (c.cuvE.equals(it.next()))
    diferit++;
  //System.out.println(c.cuvE);
}

The problem with this code is that it only looks for duplicates in adjacent elements, but the duplicate elements are far apart.
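
To make the point concrete, here is a minimal sketch that replays the same iterator pattern on a plain list holding the four cuvE values from the question. It is an illustration, not the asker's exact program: the class name IteratorPairs and the method pairsVisited are made up for this example. Each pass of the loop calls next() twice, so only the pairs (1st, 2nd) and (3rd, 4th) are ever examined, and the two "to go" entries never meet. In the original loop the comparison is also between a String (c.cuvE) and a Cuvant object, which String.equals always reports as false.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorPairs {

    // Replays the shape of the question's loop and records which pairs get compared.
    public static List<String> pairsVisited(List<String> words) {
        List<String> pairs = new ArrayList<>();
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {
            String first = it.next();
            // Second next() in the same pass: it consumes the neighbour as well.
            // With an odd-sized list this call would throw NoSuchElementException.
            String second = it.next();
            pairs.add(first + " vs " + second);
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("to go", "to listen", "to arrive", "to go");
        for (String pair : pairsVisited(words)) {
            System.out.println(pair);
        }
        // prints:
        // to go vs to listen
        // to arrive vs to go
    }
}
```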

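You need to take the first element and compare it with every element after it, then take the second element and compare it with every element after it, and so on. That way you will find the duplicates.

Edit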

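You can use the following code to find duplicates of cuvE:

int diferit = 0;

// Compare each entry with every later entry, so no pair is skipped.
for (int i = 0; i < al.size(); i++) {
    Cuvant c = al.get(i);
    for (int j = i + 1; j < al.size(); j++) {
        if (c.cuvE.equals(al.get(j).cuvE))
            diferit++;
    }
}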
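
As a possible alternative to comparing every pair, a set can remember which words have already been seen, so the list is walked only once. The sketch below is one way to do that, not part of the original answer: the class DuplicateFinder and the method findDuplicates are hypothetical names, and it works on a plain list of strings (for the question's data, that would be the cuvE value of each Cuvant).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DuplicateFinder {

    // Returns every word that occurs more than once, in the order the repeats are found.
    public static Set<String> findDuplicates(List<String> words) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String word : words) {
            // Set.add returns false when the element was already in the set.
            if (!seen.add(word)) {
                duplicates.add(word);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("to go", "to listen", "to arrive", "to go");
        Set<String> duplicates = findDuplicates(words);
        if (duplicates.isEmpty()) {
            System.out.println("no duplicates");
        } else {
            System.out.println("are duplicates: " + duplicates);
        }
        // prints: are duplicates: [to go]
    }
}
```

Besides doing less work on long lists, this version also reports which words are repeated rather than only how many matching pairs there are.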