Word frequency counter - Java

Time: 2015-10-22 11:35:12

Tags: java word-count word-frequency

import java.io.EOFException;

public interface ICharacterReader {
char GetNextChar() throws EOFException;
void Dispose();
}

import java.io.EOFException;
import java.util.Random;

public class SimpleCharacterReader implements ICharacterReader {
private int m_Pos = 0;

public static final char lf = '\n';

private String m_Content = "It was the best of times, it was the worst of times," + 
lf +
"it was the age of wisdom, it was the age of foolishness," + 
lf +
"it was the epoch of belief, it was the epoch of incredulity," + 
lf +
"it was the season of Light, it was the season of Darkness," + 
lf +
"it was the spring of hope, it was the winter of despair," + 
lf +
"we had everything before us, we had nothing before us," + 
lf +
"countries it was clearer than crystal to the lords of the State" + 
lf +
"preserves of loaves and fishes, that things in general were" + 
lf +
"settled for ever";

Random m_Rnd = new Random();

public char GetNextChar() throws EOFException {

    if (m_Pos >= m_Content.length()) {
        throw new EOFException();
    }

    return m_Content.charAt(m_Pos++);

}

public void Dispose() {
    // do nothing
}
}

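Basically, I have created an interface called ICharacterReader that returns the next character of a text and throws an exception when no characters are left. Below it, I have created a class called SimpleCharacterReader that holds a list of sentences whose word frequencies need to be counted. Now I am trying to create a separate class that takes an ICharacterReader as a parameter and simply returns the word frequencies. I am a beginner at programming and not sure how to go about this, so any simple suggestions would be appreciated.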

1 answer:

Answer 0 (score: 0)

Your task can be done in two parts:

1. Read the char data and combine it into a String

Just use a StringBuilder and keep appending chars until you get the exception.

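ICharacterReader reader = ...
StringBuilder sb = new StringBuilder();
try {
    // GetNextChar() signals the end of the input by throwing EOFException
    while (true) {
        sb.append(reader.GetNextChar());
    }
} catch (EOFException ex) {
    // expected: all characters have been read
}
String stringData = sb.toString();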
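The reading loop above could be wrapped in a small helper that also calls Dispose() once reading is finished. This is only a sketch: the class name CharacterReaders and the method readAll are made up for illustration, and the interface is repeated here (without `public`) purely so the snippet compiles on its own.

```java
import java.io.EOFException;

// Repeated from the question (without `public`) so this sketch compiles alone.
interface ICharacterReader {
    char GetNextChar() throws EOFException;
    void Dispose();
}

// Hypothetical helper; the class and method names are assumptions, not part of the question.
final class CharacterReaders {
    private CharacterReaders() {}

    // Drains the reader into a String and always releases it afterwards.
    static String readAll(ICharacterReader reader) {
        StringBuilder sb = new StringBuilder();
        try {
            while (true) {
                sb.append(reader.GetNextChar());
            }
        } catch (EOFException endOfInput) {
            // Expected: the interface reports end of input by throwing.
        } finally {
            reader.Dispose();
        }
        return sb.toString();
    }
}
```

With the class from the question, this would presumably be called as `String stringData = CharacterReaders.readAll(new SimpleCharacterReader());`.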

2. Count the word frequencies

Split the text into words with a regular expression, then count how often each word occurs. You can do this easily with the Stream API:

Map<String, Long> frequencies = Arrays.stream(stringData.split(" +|\n"))
                                      .collect(Collectors.groupingBy(Function.identity(),
                                                                     Collectors.counting()));
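Note that splitting on " +|\n" leaves punctuation attached and is case-sensitive, so with the sample text "times," and "times" (or "It" and "it") would be counted as different words. If that is not wanted, one possible variation is sketched below. It assumes that a word is a run of ASCII letters and that case should be ignored, so an apostrophe would split a word in two. The class name WordFrequencies is made up for illustration.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class name; not part of the original question or answer.
final class WordFrequencies {
    private WordFrequencies() {}

    // Counts words case-insensitively, treating every non-letter as a separator.
    static Map<String, Long> count(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^a-z]+"))
                // split() yields an empty first element when the text starts with a separator
                .filter(word -> !word.isEmpty())
                // TreeMap keeps the result sorted by word, which is easier to read when printed
                .collect(Collectors.groupingBy(Function.identity(),
                                               TreeMap::new,
                                               Collectors.counting()));
    }
}
```

Calling `WordFrequencies.count(stringData)` on the string built in part 1 should then give a single entry for each distinct word regardless of capitalisation or trailing commas.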