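Problem with reading and writing in a Java program

Posted: 2009-12-05 11:38:36

Tags: java sorting string

This isn't a general question about reading or writing. I wrote a Java program that reads text files of metadata extracted from images. The files contain names, often in long lists of more than 4,000 entries. Unfortunately, many of the names are identical, so I wrote a program that takes the list from a .txt file, removes the duplicates, and writes the cleaned, alphabetically sorted list to an output .txt file. The program also wraps each name in HTML list tags so that I can copy and paste the result wherever I need it.

Sample text file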

Chatty Little Kitty
Chatty Little Kitty
Bearly Nuf Taz
Got Lil Pepto

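etc. etc.

You can see the one I used for testing here: http://www.megaupload.com/?d=WNXYVHEN

However, it doesn't seem to work correctly, because my output file still contains duplicates. The code looks right to me, though, which is why I'm asking whether the problem is in how I set up the reading and writing.

My code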

/*
 * This program takes in a text file that has a bunch of words listed. It then creates a single alphabetically
 * organized html list from that data. It also strips the data of duplicates.
 */

import java.io.*;
import java.util.Arrays;

public class readItWriteIt {
  public static void main(String args[]) {
    int MAX = 10000;
    String[] lines = new String[MAX];
    boolean valid = true;

    try{
    //Set up Input
    FileInputStream fstream = new FileInputStream("test.txt");
    DataInputStream in = new DataInputStream(fstream);
    BufferedReader br = new BufferedReader(new InputStreamReader(in));
    String strLine;


    //Set up Output
    FileWriter ostream = new FileWriter("out.txt");
    BufferedWriter out = new BufferedWriter(ostream);

    //counters
    int count = 0;
    int second_count = 0;

    //start reading in lines from the file
    while ((strLine = br.readLine()) != null){   

    //check to make sure that there aren't duplicates. If a line is the same as another line 
    //set boolean valid to false else set to true.
    if((second_count++ > 0) && (count > 0)){
        for(int i=0; i < count; i++)
        {
            if(lines[i].equals(strLine)){
                valid = false;
            }
            else
            {
                valid = true;
            }
        }
    }


    //only copy the line to the local array if it is not a duplicate. Else do nothing with it.  
        if (valid == true){
            lines[count] = strLine.trim();
            count++;
        }
        else{}
      second_count++;
    }

    //create a second array so that you can get rid of all the null values. It is the size of the 
    //used length in the first array called "lines"
    String[] newlines = new String[count];

    //copy data from array lines to array called newlines
    for(int i = 0; i < count; i++){ 
        newlines[i] = lines[i];
    }

    //sort the array alphabetically
    Arrays.sort(newlines);

    //write it out to file in alphabetical order along with the list syntax for html
    for(int i = 0; i < count; i++)
    {
        out.write("<li>" + newlines[i] + "</li>");
        out.newLine();
    }

    //close I/O
    in.close();
    out.close();

    }catch (Exception e){//Catch exception if any
      System.err.println("Error: " + e.getMessage());
    }
  }
}

I hope someone can help me. Thanks a lot! :)

Hey guys, thanks for the suggestions and help. I rewrote it like this:

import java.util.HashSet;
import java.util.Set;
import java.io.*;
import java.util.Arrays;

public class converter {
public static void main(String[] args) {

try{
    //Set up Input
    FileInputStream fstream = new FileInputStream("test.txt");
    DataInputStream in = new DataInputStream(fstream);
    BufferedReader br = new BufferedReader(new InputStreamReader(in));
    String strLine;

    //Set up Output
    FileWriter ostream = new FileWriter("out.txt");
    BufferedWriter out = new BufferedWriter(ostream);

    Set lines = new HashSet();
    boolean result;

    while ((strLine = br.readLine()) != null){   
      result = lines.add(strLine.trim());
    }
    String[] newlines = new String[lines.size()];
    lines.toArray(newlines);

    Arrays.sort(newlines);

    //write it out to file in alphabetical order along with the list syntax for html
    for(int i = 0; i < lines.size(); i++)
    {
        out.write("<li>" + newlines[i] + "</li>");
        out.newLine();
    }

    out.close();
    in.close();

   }catch (Exception e){//Catch exception if any
            System.err.println("Error: " + e.getMessage());
   }
}
}

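Thanks to ewernli, it is now much more efficient. I didn't know about sets, since I have only just taken my first Java class and we didn't cover them, but it's a great feature. Thanks for introducing me to it!

3 Answers:

Answer 0: (score: 1)

If you add the lines to a Set (as keys) instead of an array, you'll find that you don't need to do any duplicate handling yourself. The Set takes care of it for you, and your program will be simpler and shorter.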
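As a rough illustration of this suggestion, the sketch below collects names into a HashSet and sorts them afterwards. The class name `SetDedupe`, the helper `distinctSorted`, and the hard-coded input list are hypothetical and stand in for the lines read from test.txt; this is not the asker's actual program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SetDedupe {
    // Hypothetical helper: returns the distinct, trimmed names in sorted order.
    static List<String> distinctSorted(List<String> names) {
        Set<String> seen = new HashSet<String>();
        for (String name : names) {
            // add() leaves the set unchanged if an equal element is already present
            seen.add(name.trim());
        }
        List<String> result = new ArrayList<String>(seen);
        Collections.sort(result); // natural (case-sensitive) String ordering
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample input; the trailing space shows why trimming matters
        List<String> input = Arrays.asList(
            "Chatty Little Kitty", "Chatty Little Kitty ", "Bearly Nuf Taz");
        for (String name : distinctSorted(input)) {
            System.out.println("<li>" + name + "</li>");
        }
    }
}
```

Because the set rejects equal elements on its own, the nested comparison loop and the `valid` flag from the question are no longer needed.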

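Answer 1: (score: 1)

An array isn't the data structure you want (do you really need a structure with a fixed length and ordering but mutable elements?). Take a look at the collection types in java.util. In particular, look at SortedSet and its TreeSet implementation. It will:

  1. Expand to hold your data
  2. Eliminate duplicates (it is a Set)
  3. Sort its contents as they are added (see Comparator implementations such as String.CASE_INSENSITIVE_ORDER)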
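The three points above could be sketched roughly as follows. The class name `TreeSetNames`, the helper `sortedUnique`, and the sample input are made up for illustration. Note that with String.CASE_INSENSITIVE_ORDER, names differing only in case count as duplicates, which may or may not be what you want.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class TreeSetNames {
    // Hypothetical helper: unique names, ordered case-insensitively.
    static List<String> sortedUnique(List<String> names) {
        // The set grows as needed, drops duplicates, and keeps its elements sorted
        SortedSet<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        for (String name : names) {
            // An element that compares equal to an existing one is ignored,
            // so the first spelling that was added is the one that stays
            set.add(name.trim());
        }
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        // Assumed sample input standing in for the lines of the text file
        List<String> input = Arrays.asList(
            "Chatty Little Kitty", "Bearly Nuf Taz", "chatty little kitty");
        for (String name : sortedUnique(input)) {
            System.out.println("<li>" + name + "</li>");
        }
    }
}
```

Iterating over the TreeSet already yields sorted order, so no separate Arrays.sort call is needed in this variant.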

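Answer 2: (score: 0)

Your code does need some improvement overall, but the main mistake I see is that you store the trimmed line in the lines array and then compare it against the untrimmed string.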

lines[i].equals(strLine) // instead use "lines[i].equals(strLine.trim())"
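For comparison, here is a minimal array-based sketch that trims once and then compares like with like. The names `TrimBeforeCompare`, `alreadyStored`, and `dedupe` are hypothetical. The sketch also returns at the first match, which goes slightly beyond the fix above: in the question's loop `valid` is overwritten on every iteration, so only the comparison with the last stored line decides the outcome.

```java
import java.util.Arrays;

public class TrimBeforeCompare {
    // Hypothetical helper: true if candidate equals one of the first count entries.
    static boolean alreadyStored(String[] lines, int count, String candidate) {
        for (int i = 0; i < count; i++) {
            if (lines[i].equals(candidate)) {
                return true; // stop at the first match so later entries cannot undo it
            }
        }
        return false;
    }

    // Removes duplicates while keeping first-seen order.
    static String[] dedupe(String[] raw) {
        String[] lines = new String[raw.length];
        int count = 0;
        for (String strLine : raw) {
            String trimmed = strLine.trim(); // trim once, then compare and store the same value
            if (!alreadyStored(lines, count, trimmed)) {
                lines[count++] = trimmed;
            }
        }
        return Arrays.copyOf(lines, count); // drop the unused tail of the array
    }

    public static void main(String[] args) {
        // Assumed sample input; the trailing space mimics untrimmed file lines
        String[] result = dedupe(new String[] {
            "Chatty Little Kitty ", "Chatty Little Kitty", "Bearly Nuf Taz"});
        Arrays.sort(result);
        for (String name : result) {
            System.out.println("<li>" + name + "</li>");
        }
    }
}
```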