Question

So the task is to read a file containing the following names:

Alice
Bob 
James  
Richard  
Bob  
Alice  
Alice  
Alice  
James  
Richard  
Bob 
Richard  
Bob  
Stephan  
Michael  
Henry

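and print each name together with its number of occurrences, e.g. "Alice - <4>". I basically have it working. My only problem is that the last name (Stephan - <1>) is missing from my output, and I can't get it to work. It's probably because I use [i-1], but as I said, I can't find a proper solution here.
Well, here is my code: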

package Assignment4;

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.BufferedReader;
import java.util.Arrays;

public class ReportUniqueNames {

    public static void main(String[] args) {
        // TODO Auto-generated method stub

        System.out.println ("       This programm counts words, characters and lines!\n");
        System.out.println ("Please enter the name of the .txt file:");

        BufferedReader input = new BufferedReader(new InputStreamReader (System.in));

        BufferedReader read = null;
        String file = "";
        String text = "";
        String line = "";
        boolean unique = true;
        int nameCounter = 1;        

        try {

            file = input.readLine();
            read = new BufferedReader (new FileReader(file));
            while ((line = read.readLine()) != null) {
                text += line.trim() + " ";                  
            }

        } catch (FileNotFoundException e) {
            System.out.println("File was not found.");          
        } catch (IOException e) {
            System.out.println("An error has occured.");            
        }

        String textarray[] = text.split(" ");
        Arrays.sort(textarray);

        for (int i=0; i < textarray.length; i++) {

            if (i > 0 && textarray[i].equals(textarray[i-1])) {
                nameCounter++;
                unique = false;
            }

            if (i > 0 && !textarray[i].equals(textarray[i-1]) && !unique) {
                    System.out.println("<"+textarray[i-1]+"> - <"+nameCounter+">");
                    nameCounter = 1;
                    unique = true;
            } else if (i > 0 && !textarray[i].equals(textarray[i-1]) && unique) {
                //nameCounter = 1;
                System.out.println("<"+textarray[i-1]+"> - <"+nameCounter+">");
            }           

        }

    }

}

That's it. I hope one of you can help me.

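Edit: Wow, so many different approaches.
First of all, thanks for your help.
I'll take a close look at your suggested solutions and maybe start over from scratch ;).
I'll give you a heads-up when I'm done.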

Answer 1

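You can use a Scanner with the newline character as delimiter to read the input file (whose location is represented by "filepath") and add the words directly to an ArrayList<String>.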

Then iterate over the ArrayList<String> and count the frequency of each word from the original file in a HashMap<String, Integer>.

Complete working code:

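import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Note: new Scanner(File) throws FileNotFoundException, which the caller must handle.
Scanner s = new Scanner(new File("filepath")).useDelimiter("\n");
List<String> list = new ArrayList<>();
while (s.hasNext()) {
    list.add(s.next().trim()); // trim() drops trailing spaces and a possible '\r'
}
s.close();

Map<String, Integer> wordFrequency = new HashMap<>();

for (String str : list) {
    if (wordFrequency.containsKey(str))
        wordFrequency.put(str, wordFrequency.get(str) + 1); // Increment the frequency by 1
    else
        wordFrequency.put(str, 1);
}

// Print the frequency; iterate over the map so each name is printed only once
for (Map.Entry<String, Integer> entry : wordFrequency.entrySet()) {
    System.out.println(entry.getKey() + ": " + entry.getValue());
}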

Edit

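Alternatively, you can read the entire file into a single String and then split its contents into a list using \n as the delimiter. The code is shorter than the first option: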

// \Z is the end-of-input anchor, so the entire file is read in one call to next()
String fileContents = new Scanner(new File("filepath")).useDelimiter("\\Z").next();
// Using the newline character as delimiter, add every word to the list
List<String> list = Arrays.asList(fileContents.split("\\s*\\n\\s*"));

Answer 2

You can simply use a Map (emulating a "Multiset") to count the words:

String textarray[] = text.split(" ");

// TreeMap gives sorting by alphabetical order "for free"
Map<String, Integer> wordCounts = new TreeMap<>();

for (int i = 0; i < textarray.length; i++) {
    Integer count = wordCounts.get(textarray[i]);
    wordCounts.put(textarray[i], count != null ? count + 1 : 1);
}

for (Map.Entry<String, Integer> e : wordCounts.entrySet()) {
    System.out.println("<" + e.getKey() + "> - <" + e.getValue() + ">");
}
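
The same counting idea can be written a little more compactly with Map.merge (available since Java 8). The following is only a sketch: the class name NameCounts, the helper count and the sample names in main are made up for illustration and are not part of the original answer.

```java
import java.util.Map;
import java.util.TreeMap;

public class NameCounts {

    // Counts how often each name occurs; TreeMap keeps the keys in alphabetical order.
    static Map<String, Integer> count(String[] names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            // merge() stores 1 for a new key, otherwise adds 1 to the existing count
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, standing in for the names read from the file
        String[] names = {"Alice", "Bob", "James", "Alice", "Stephan"};
        for (Map.Entry<String, Integer> e : count(names).entrySet()) {
            System.out.println("<" + e.getKey() + "> - <" + e.getValue() + ">");
        }
    }
}
```

Because the map is filled in a single pass and printed afterwards, there is no "last entry" special case to forget.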

Answer 3

I'd do it like this:

Map<String,Integer> occurs = new HashMap<String,Integer>();
int i = 0, number;

for (; i < textarray.length; i++) {
   if (occurs.containsKey(textarray[i])) {
      number = occurs.get(textarray[i]);
      occurs.put(textarray[i], number + 1);
   } else {
      occurs.put(textarray[i], 1);
   }
}

for(Map.Entry<String, Integer> entry : occurs.entrySet()){
     System.out.println("<" + entry.getKey() + "> - " + entry.getValue()); 
}

Answer 4

System.out.println("<"+textarray[textarray.length-1]+"> - <"+nameCounter+">");

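You need this after your loop, because even though your loop runs the correct number of times, you only ever print up to i-1.

But using a map is the better option.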
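
To make the placement concrete, here is a minimal sketch of the sort-and-compare approach with the extra print moved after the loop. The class SortedRunCount and the method report are hypothetical names, and the lines are collected in a list rather than printed directly so the result is easy to inspect; the unique flag from the question is dropped because nameCounter alone carries the needed state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SortedRunCount {

    // Returns one "<name> - <count>" line per distinct name, in sorted order.
    static List<String> report(String[] names) {
        List<String> lines = new ArrayList<>();
        if (names.length == 0) {
            return lines; // nothing to report
        }
        String[] sorted = names.clone();
        Arrays.sort(sorted); // equal names become neighbours
        int nameCounter = 1;
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i].equals(sorted[i - 1])) {
                nameCounter++;
            } else {
                // The run of sorted[i-1] has ended, so report it and start a new run
                lines.add("<" + sorted[i - 1] + "> - <" + nameCounter + ">");
                nameCounter = 1;
            }
        }
        // The final run is never followed by a different name, so report it here
        lines.add("<" + sorted[sorted.length - 1] + "> - <" + nameCounter + ">");
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample input
        String[] names = {"Bob", "Alice", "Stephan", "Alice"};
        for (String line : report(names)) {
            System.out.println(line);
        }
    }
}
```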

Answer 5

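That's because your code only prints the result for a name once the next name differs from it. So you are missing the print statement for the last entry. To fix this, you can add another if statement at the end of the loop body that checks whether this is the last iteration of the loop. The if statement looks like this: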

if(i == textarray.length - 1){
    System.out.println("<"+textarray[i]+"> - <"+nameCounter+">");
}

The loop will now look like this:

for (int i = 0; i < textarray.length; i++) {

    if (i > 0 && textarray[i].equals(textarray[i-1])) {
        nameCounter++;
        unique = false;
    }

    if (i > 0 && !textarray[i].equals(textarray[i-1]) && !unique) {
        System.out.println("<"+textarray[i-1]+"> - <"+nameCounter+">");
        nameCounter = 1;
        unique = true;
    } else if (i > 0 && !textarray[i].equals(textarray[i-1]) && unique) {
        //nameCounter = 1;
        System.out.println("<"+textarray[i-1]+"> - <"+nameCounter+">");
    }

    // Last iteration: print the final name, which no later element would trigger
    if (i == textarray.length - 1) {
        System.out.println("<"+textarray[i]+"> - <"+nameCounter+">");
    }
}

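The loop will now also print the result for the last entry in the list.

I hope this helps :)

P.S. Some of the other solutions here are more efficient, but this is a fix for your current approach.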

Answer 6

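First things first: you won't need the text variable, because we'll replace it with a more suitable data structure. You need something that stores the names found in the file so far, along with an integer (the number of occurrences) for each name you found. Like Dmitry, I think the best data structure you can use for this particular case is a Hashtable or HashMap.

Assuming the file has one name per line without any punctuation or spaces, your code would look like this: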

try {
    Hashtable<String,Integer> table = new Hashtable<String,Integer>();
    file = input.readLine();
    read = new BufferedReader (new FileReader(file));
    while ((line = read.readLine()) != null) {
        line = line.trim(); // trim() returns a new String, so the result must be assigned
        if(table.containsKey(line))
            table.put(line, table.get(line)+1);
        else
            table.put(line, 1);                
    }
    System.out.println(table); // looks pretty good and compact on the console... :)
} catch (FileNotFoundException e) {
    System.out.println("File was not found.");          
} catch (IOException e) {
    System.out.println("An error has occured.");            
}
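
As a variation on the snippet above, the following sketch uses a HashMap and a try-with-resources block so that the reader is closed automatically. The class LineCounts and the method countLines are hypothetical names, and a StringReader stands in for the FileReader so the example runs without a file on disk; skipping blank lines is an added assumption, not something the original answer does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class LineCounts {

    // Reads one name per line and counts how often each name occurs.
    static Map<String, Integer> countLines(BufferedReader reader) throws IOException {
        Map<String, Integer> table = new HashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim(); // strings are immutable, so keep the trimmed copy
            if (line.isEmpty()) {
                continue; // assumption: blank lines are not names
            }
            table.put(line, table.getOrDefault(line, 0) + 1);
        }
        return table;
    }

    public static void main(String[] args) throws IOException {
        // In the real program this would wrap new FileReader(file) instead
        try (BufferedReader reader = new BufferedReader(new StringReader("Alice\nBob \nAlice\n"))) {
            System.out.println(countLines(reader));
        }
    }
}
```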

Answer 7

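I'd like to discuss the logic you originally used to decide whether a value in the string array is unique.

You only compared two adjacent cells of the array and assumed that if they are not equal, the name in textarray[i] is unique!

That is wrong, because it can happen while your "unique" boolean variable is set to true.

For example: John | Luke | John | Charlotte. Comparing the first and the second cell tells you that John and Luke are not equal, and comparing again as the loop's "i" advances will also say the neighbours are not equal, but that does not mean John is unique.

So let's imagine we had no Map in Java. How would we solve this algorithmically?

I'll help you with an idea.

1 - Create a function that takes as parameters the string to verify and the table.
2 - Then loop over the whole table, testing whether the string equals the current cell of the table; if so, return null or -1.
3 - If you finish looping over the table up to the last cell of the array, that means your string is unique; just print it on the screen.
4 - Call this function textarray.length times and you will have only the unique names displayed on the screen.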
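
The steps above can be sketched roughly as follows. This is one possible reading, not the answerer's exact code: the check only looks at the cells before the current index (otherwise every name would match itself), it returns a boolean instead of null or -1, and a second helper counts the occurrences so the output matches what the question asks for. The names UniqueWithoutMap, isFirstOccurrence and countOccurrences are hypothetical.

```java
public class UniqueWithoutMap {

    // Returns true if table[index] does not appear in any earlier cell.
    static boolean isFirstOccurrence(String[] table, int index) {
        for (int j = 0; j < index; j++) {
            if (table[j].equals(table[index])) {
                return false; // seen before, so it was already reported
            }
        }
        return true;
    }

    // Counts how many cells of the table are equal to name.
    static int countOccurrences(String[] table, String name) {
        int count = 0;
        for (String cell : table) {
            if (cell.equals(name)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The example from the answer; no sorting is required for this approach
        String[] textarray = {"John", "Luke", "John", "Charlotte"};
        for (int i = 0; i < textarray.length; i++) {
            if (isFirstOccurrence(textarray, i)) {
                System.out.println("<" + textarray[i] + "> - <"
                        + countOccurrences(textarray, textarray[i]) + ">");
            }
        }
    }
}
```

This needs no map and no sorting, but it compares every name with every other name, so it takes quadratic time. That is fine for a short list of names and slow for large files.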

List names and their number of occurrences
