Question

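Program description:

I have this program, which is meant to read a (large) file one word at a time and check whether each word already exists in an array that holds the unique words. If it does not, the program should append the word to the end of the array, add 1 to the unique-word counter, and add 1 at the same index in the count array. If the word is already somewhere in the array, the program should find its index and increase the value at that index in the count array by 1. It should keep doing this for as long as the file has more content. I am also not allowed to use HashMaps.

However, my program goes into an infinite loop while reading the file, and in the blink of an eye the unique-word count easily exceeds 100,000, when it should be 5000 at most...

Here is the code: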

class Oblig3A{
    public static void main(String[]args){

    OrdAnalyse oa = new OrdAnalyse();
    String filArgs=args[0];
    oa.analyseMetode(filArgs);
    }
}

class OrdAnalyse{
    void analyseMetode(String filArgs){

    //Begins with naming all of the needed variables
    Scanner input, innfil;
    String[] ord, fortelling;
    int[] antall;
    int antUnikeOrd, totalSum;
    PrintWriter utfil;

    //Declaring most of them.
    input=new Scanner(System.in);
    ord=new String[5000];
    antall=new int[5000];
    antUnikeOrd=0;
    totalSum=0;
    try{
        innfil=new Scanner(new File(filArgs));



    //The problem is located here somewhere:
        while(innfil.hasNext()){
        fortelling=innfil.nextLine().toLowerCase().split(" ");

        ord[0]=innfil.next().toLowerCase();

            for(int i=0; i<fortelling.length; i++){
            for(int j=0; j<5000; j++){
            if(fortelling[i].equals(ord[j])){
                antall[j]+=1;
                System.out.print("heo");
            }else{
                ord[j]=fortelling[i];
                antall[j]+=1;
                antUnikeOrd+=1;
                }
            System.out.println(ord.length);
            System.out.println(antUnikeOrd);

            }
        }
        }
        innfil.close();
    }catch(Exception e){
        e.printStackTrace();
    }

   // Here the program will write all the info acquired above into a file called Oppsummering.txt, which it will make.
    try{
        utfil=new PrintWriter(new File("Oppsummering.txt"));

        for(int i=0; i<antall.length; i++){
        totalSum+=antall[i];
        }

        utfil.println("Antall ord lest: " +totalSum+ " og antall unike ord: "+antUnikeOrd);

        for(int i=0; i<ord.length; i++){

        utfil.println(ord[i]+("  ")+antall[i]);
        }
        utfil.close();
    }catch(Exception e){
        e.printStackTrace();
    }
    }
}

Answer 1

//The problem is located here somewhere:
    // hasNextLine() turns false at the end of the file, so the loop terminates.
    while(innfil.hasNextLine()){
        fortelling=innfil.nextLine().toLowerCase().split("\\s+");

        for(int i=0; i<fortelling.length; i++){
            if(fortelling[i].isEmpty()) continue;   // skip blanks from empty lines

            // Look through every unique word stored so far, not just ord[0].
            boolean found=false;
            for(int j=0; j<antUnikeOrd; j++){
                if(fortelling[i].equals(ord[j])){
                    antall[j]+=1;
                    found=true;
                    break;
                }
            }

            // A word seen for the first time goes into the next free slot.
            if(!found){
                ord[antUnikeOrd]=fortelling[i];
                antall[antUnikeOrd]=1;
                antUnikeOrd+=1;
            }
        }
    }
    innfil.close();
}

Try this; I'm not sure whether it works!

I think the problem is that you only loop over one element instead of over all of them.

Good luck!!!

Answer 2

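I'm not answering your question directly, but I'm offering you a simpler solution. I must admit I'm lazy, and analyzing your code is a lot for someone like me :) partly because it is not in English, and partly because the code could be much simpler if the right container were used. I tested your code with a smaller file and it loops forever there too, so the size doesn't matter.

As I said, this can be done much more simply with an appropriate container. So here is my solution: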

    Map<String, Integer> wordsMap = new HashMap<String, Integer>();

    Scanner scanner = new Scanner(new File("C:\\temp\\input.txt"));
    while(scanner.hasNext()){
        String word = scanner.next();
        wordsMap.put(word ,wordsMap.containsKey( word ) ? wordsMap.get( word ) + 1 : 1);
    }

    System.out.println("Total number of unique words: "+wordsMap.size());
    for( String word : wordsMap.keySet()){
        System.out.println("Word \""+word+"\" occurs "+wordsMap.get(word)+" times.");
    }

The counting logic is in the while loop. The printing happens in the for loop; swap System.out for a file writer there and you should be fine.
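As a rough sketch of that last step, the version below sends the same report to a PrintWriter instead of System.out. The class name WordReport and the helper methods count and write are made up for illustration; the output file name Oppsummering.txt is borrowed from the question, and a TreeMap stands in for the HashMap only so the words come out in a stable order.

```java
import java.io.PrintWriter;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

public class WordReport {
    // Counts every whitespace-separated token the scanner delivers.
    static Map<String, Integer> count(Scanner scanner) {
        Map<String, Integer> wordsMap = new TreeMap<>(); // sorted keys, stable output
        while (scanner.hasNext()) {
            wordsMap.merge(scanner.next(), 1, Integer::sum);
        }
        return wordsMap;
    }

    // Writes the report to any PrintWriter: a file, System.out, or a string buffer.
    static void write(Map<String, Integer> wordsMap, PrintWriter out) {
        out.println("Total number of unique words: " + wordsMap.size());
        for (Map.Entry<String, Integer> e : wordsMap.entrySet()) {
            out.println("Word \"" + e.getKey() + "\" occurs " + e.getValue() + " times.");
        }
        out.flush();
    }

    public static void main(String[] args) throws Exception {
        Map<String, Integer> wordsMap = count(new Scanner("to be or not to be"));
        // try-with-resources closes the file even if writing fails
        try (PrintWriter out = new PrintWriter("Oppsummering.txt")) {
            write(wordsMap, out);
        }
    }
}
```

Keeping the writer as a parameter means the same method can still print to the console with new PrintWriter(System.out) while you are debugging.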

Answer 3

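There are a few different problems preventing your program from working as expected. First, your use of the Scanner does not give the results you probably expect. Suppose we have a very simple input file like this: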

apple banana carrot
alligator baboon crocodile

Initially, the scanner is positioned at the start of the file, like this:

|apple banana carrot
alligator baboon crocodile

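When you call .nextLine(), the scanner advances the cursor to the end of the line and returns all the data it passed. So fortelling is set to ["apple", "banana", "carrot"], and the scanner is positioned at the start of the second line, like this: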

apple banana carrot
|alligator baboon crocodile

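So when you then call .next(), ord[0] is set to "alligator" and the cursor moves again. Scanners cannot be rewound, so once you have read some data with one of the next...() methods, you cannot read that data again with the same scanner.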
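To see the cursor behaviour for yourself, here is a small sketch that runs the same two calls on the sample text held in a String instead of a file. The class name ScannerCursorDemo and the method readLikeTheQuestion are invented for this illustration.

```java
import java.util.Arrays;
import java.util.Scanner;

public class ScannerCursorDemo {
    // Returns { words of the first line, token read by next(), rest of that line }.
    static String[] readLikeTheQuestion(String text) {
        Scanner sc = new Scanner(text);
        String[] firstLine = sc.nextLine().split(" "); // consumes the whole first line
        String next = sc.next();                       // first token of the SECOND line
        String rest = sc.nextLine();                   // what is left of the second line
        sc.close();
        return new String[] { String.join(",", firstLine), next, rest };
    }

    public static void main(String[] args) {
        String text = "apple banana carrot\nalligator baboon crocodile\n";
        System.out.println(Arrays.toString(readLikeTheQuestion(text)));
    }
}
```

The word "alligator" is swallowed by next() and never reaches fortelling, which is why mixing nextLine() and next() on the same scanner loses words.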

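Your second problem is the logic inside the loop. fortelling[i].equals(ord[j]) will always evaluate to false, because none of the strings in fortelling is "alligator" (and every other slot of ord starts out as null). So the following lines are always executed: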

ord[j]=fortelling[i];
antall[j]+=1;
antUnikeOrd+=1;

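Because of your inner loop, these lines are repeated 5000 times for each word in the first line of the file. So after the first iteration of the outer loop, the variables look like this: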

ord : [ "apple", "apple", "apple", "apple", "apple", ... ]
antall : [ 1, 1, 1, 1, 1, ... ]
antUnikeOrd : 5000

After the second iteration they will be:

ord : [ "banana", "banana", "banana", "banana", "banana", ... ]
antall : [ 2, 2, 2, 2, 2, ... ]
antUnikeOrd : 10000

And then:

ord : [ "carrot", "carrot", "carrot", "carrot", "carrot", ... ]
antall : [ 3, 3, 3, 3, 3, ... ]
antUnikeOrd : 15000

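This is why your unique-word count grows so fast: 5000 is added for every word you process. Even without the scanner problem, the logic here is incorrect. If a word matches an existing word, you need to act once, not 5000 times. A well-placed break statement would probably fix this.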
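If you want to check that arithmetic, the sketch below replays the question's two nested loops on the first line of the sample file. The class BuggyLoopDemo and the method runBuggyLoop are names made up for this demo; the loop body itself is copied from the question, minus the print statements.

```java
public class BuggyLoopDemo {
    // Replays the question's inner loops and returns the final antUnikeOrd.
    static int runBuggyLoop(String[] fortelling, String firstOrd) {
        String[] ord = new String[5000];
        int[] antall = new int[5000];
        int antUnikeOrd = 0;
        ord[0] = firstOrd; // what innfil.next() put into ord[0]

        for (int i = 0; i < fortelling.length; i++) {
            for (int j = 0; j < 5000; j++) {
                if (fortelling[i].equals(ord[j])) {
                    antall[j] += 1;
                } else {
                    // Runs for EVERY slot that does not match, so 5000 times per word.
                    ord[j] = fortelling[i];
                    antall[j] += 1;
                    antUnikeOrd += 1;
                }
            }
        }
        return antUnikeOrd;
    }

    public static void main(String[] args) {
        String[] firstLine = { "apple", "banana", "carrot" };
        // 3 words x 5000 slots = 15000
        System.out.println(runBuggyLoop(firstLine, "alligator"));
    }
}
```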

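Also, you change the value of ord[0] on every iteration of the while loop. If that array is supposed to be the list of unique words, this is probably wrong. Each item in ord should be set once and only once.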
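Putting those points together, one possible shape for the counting code is sketched below. It keeps the two parallel arrays from the question and uses no HashMap. The class name ArrayWordCounter and the methods count and add are my own invention, and the sketch assumes the 5000-word limit from the question, so treat it as a starting point rather than a finished solution.

```java
import java.util.Scanner;

public class ArrayWordCounter {
    final String[] ord = new String[5000]; // unique words, filled from index 0 upward
    final int[] antall = new int[5000];    // antall[j] is the count for ord[j]
    int antUnikeOrd = 0;                   // number of slots in use

    // Reads word by word with next() only, so no word is skipped.
    void count(Scanner in) {
        while (in.hasNext()) {
            add(in.next().toLowerCase());
        }
    }

    void add(String word) {
        // Search only the slots that already hold a word.
        for (int j = 0; j < antUnikeOrd; j++) {
            if (ord[j].equals(word)) {
                antall[j]++;
                return; // act once, then stop looking
            }
        }
        if (antUnikeOrd == ord.length) {
            throw new IllegalStateException("more than " + ord.length + " unique words");
        }
        // New word: each slot of ord is written exactly once.
        ord[antUnikeOrd] = word;
        antall[antUnikeOrd] = 1;
        antUnikeOrd++;
    }

    public static void main(String[] args) {
        ArrayWordCounter counter = new ArrayWordCounter();
        counter.count(new Scanner("apple banana apple\ncarrot Banana apple"));
        for (int j = 0; j < counter.antUnikeOrd; j++) {
            System.out.println(counter.ord[j] + "  " + counter.antall[j]);
        }
    }
}
```

For the real program you would pass new Scanner(new File(filArgs)) to count instead of the inline sample string.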

I didn't mean for this to turn into a big code review, but there you go. I hope you find it useful!

Infinite while loop and problems with reading a file
