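I have asked this question before (Counting distinct words with Threads) and have since tidied up the code. As described in the first question, I need to count the distinct words in a file.
Debugging shows that all my words are stored and sorted correctly, but the problem now is an infinite "while" loop in the Test class that keeps running after all the words have been read (debugging really did help to figure out a few things...). I am currently testing the code on a small file with no more than 10 words.
The DataSet class is where most of the changes are.
I need some advice on how to get out of the loop.
The test looks like this: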
package test;

import java.io.File;
import java.io.IOException;
import junit.framework.Assert;
import junit.framework.TestCase;
import main.DataSet;
import main.WordReader;

public class Test extends TestCase
{
    public void test2() throws IOException
    {
        File words = new File("resources" + File.separator + "test2.txt");
        if (!words.exists())
        {
            System.out.println("File [" + words.getAbsolutePath()
                    + "] does not exist");
            Assert.fail();
        }
        WordReader wr = new WordReader(words);
        DataSet ds = new DataSet();
        String nextWord = wr.readNext();
        // This is the loop
        while (nextWord != "" && nextWord != null)
        {
            if (!ds.member(nextWord))
            {
                ds.insert(nextWord);
            }
            nextWord = wr.readNext();
        }
        wr.close();
        System.out.println(ds.toString());
        System.out.println(words.toString() + " contains " + ds.getLength()
                + " distinct words");
    }
}
Here is my updated DataSet class, in particular the member() method, which I am still unsure about because at some point I was getting a NullPointerException (no idea why...):
package main;

import sort.Sort;

public class DataSet
{
    private String[] data;
    private static final int DEFAULT_VALUE = 200;
    private int nextIndex;
    private Sort bubble;

    public DataSet(int initialCapacity)
    {
        data = new String[initialCapacity];
        nextIndex = 0;
        bubble = new Sort();
    }

    public DataSet()
    {
        this(DEFAULT_VALUE);
        nextIndex = 0;
        bubble = new Sort();
    }

    public void insert(String value)
    {
        if (nextIndex < data.length)
        {
            data[nextIndex] = value;
            nextIndex++;
            bubble.bubble_sort(data, nextIndex);
        }
        else
        {
            expandCapacity();
            insert(value);
        }
    }

    public int getLength()
    {
        return nextIndex + 1;
    }

    public boolean member(String value)
    {
        for (int i = 0; i < data.length; i++)
        {
            if (data[i] != null && nextIndex != 10)
            {
                if (data[i].equals(value))
                    return true;
            }
        }
        return false;
    }

    private void expandCapacity()
    {
        String[] larger = new String[data.length * 2];
        for (int i = 0; i < data.length; i++)
        {
            data = larger;
        }
    }
}
The WordReader class has not changed much. The ArrayList was replaced with a plain array, and the store method was modified accordingly:
package main;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class WordReader
{
    private File file;
    private String[] words;
    private int nextFreeIndex;
    private BufferedReader in;
    private int DEFAULT_SIZE = 200;
    private String word;

    public WordReader(File file) throws IOException
    {
        words = new String[DEFAULT_SIZE];
        in = new BufferedReader(new FileReader(file));
        nextFreeIndex = 0;
    }

    public void expand()
    {
        String[] newArray = new String[words.length * 2];
        // System.arraycopy(words, 0, newArray, 0, words.length);
        for (int i = 0; i < words.length; i++)
            newArray[i] = words[i];
        words = newArray;
    }

    public void read() throws IOException
    {
    }

    public String readNext() throws IOException
    {
        char nextCharacter = (char) in.read();
        while (in.ready())
        {
            while (isWhiteSpace(nextCharacter) || !isCharacter(nextCharacter))
            {
                // word = "";
                nextCharacter = (char) in.read();
                if (!in.ready())
                {
                    break;
                }
            }
            word = "";
            while (isCharacter(nextCharacter))
            {
                word += nextCharacter;
                nextCharacter = (char) in.read();
            }
            storeWord(word);
            return word;
        }
        return word;
    }

    private void storeWord(String word)
    {
        if (nextFreeIndex < words.length)
        {
            words[nextFreeIndex] = word;
            nextFreeIndex++;
        }
        else
        {
            expand();
            storeWord(word);
        }
    }

    private boolean isWhiteSpace(char next)
    {
        if ((next == ' ') || (next == '\t') || (next == '\n'))
        {
            return true;
        }
        return false;
    }

    private boolean isCharacter(char next)
    {
        if ((next >= 'a') && (next <= 'z'))
        {
            return true;
        }
        if ((next >= 'A') && (next <= 'Z'))
        {
            return true;
        }
        return false;
    }

    public boolean fileExists()
    {
        return file.exists();
    }

    public boolean fileReadable()
    {
        return file.canRead();
    }

    public Object wordsLength()
    {
        return words.length;
    }

    public void close() throws IOException
    {
        in.close();
    }

    public String[] getWords()
    {
        return words;
    }
}
And the Bubble Sort class has been changed to work with strings:
package sort;

public class Sort
{
    public void bubble_sort(String a[], int length)
    {
        for (int j = 0; j < length; j++)
        {
            for (int i = j + 1; i < length; i++)
            {
                if (a[i].compareTo(a[j]) < 0)
                {
                    String t = a[j];
                    a[j] = a[i];
                    a[i] = t;
                }
            }
        }
    }
}
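Answer 0 (score: 0)
I think the method that is actually blocking is WordReader.readNext(). My suggestion is to use a Scanner instead of a BufferedReader; it is better suited to parsing a file into words.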
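To illustrate why, as far as I can tell from reading the posted code, readNext() never signals the end of the file: BufferedReader.read() returns -1 at end of input, the cast to char turns that into '\uffff' (neither a letter nor whitespace), and because ready() is false by then the method falls through to its final return and hands back the previous word again. The sketch below only demonstrates the read() behaviour; EofDemo and readPastEnd are hypothetical names, and a StringReader stands in for the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class EofDemo {
    // Consumes the whole input, then returns what one more read() yields.
    static int readPastEnd(String text) {
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            while (in.read() != -1) {
                // keep consuming until end of input
            }
            return in.read(); // reading past the end still yields -1
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int atEnd = readPastEnd("one two");
        System.out.println(atEnd);                    // -1
        System.out.println((char) atEnd == '\uffff'); // true: the cast hides the EOF marker
    }
}
```

Keeping the result of read() in an int and comparing it with -1 before casting is the usual way to detect the end of input.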
Your readNext() method could be rewritten like this (where scan is a Scanner):
public String readNext() {
    if (scan.hasNext()) {
        String word = scan.next();
        if (!word.matches("[A-Za-z]+"))
            word = "";
        storeWord(word);
        return word;
    }
    return null;
}
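This has the same functionality as your code without using isCharacter() or isWhiteSpace(): the regular expression inside matches() checks that the word consists only of letters, and the isWhiteSpace() functionality is built into the next() method, which separates the words. The added functionality is that it returns null when there are no more words in the file.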
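For reference, here is a self-contained sketch of the same idea that can be run without the rest of the project. ScannerWordReader and readAll are hypothetical names, the storeWord() call is left out, and a StringReader replaces the file so the example needs no test data.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerWordReader {
    private final Scanner scan;

    ScannerWordReader(Readable source) {
        scan = new Scanner(source);
    }

    // Next whitespace-delimited token; "" if it is not purely letters;
    // null once the input is exhausted.
    String readNext() {
        if (scan.hasNext()) {
            String word = scan.next();
            return word.matches("[A-Za-z]+") ? word : "";
        }
        return null;
    }

    // Collects every value readNext() produces until it returns null.
    static List<String> readAll(String text) {
        ScannerWordReader reader = new ScannerWordReader(new StringReader(text));
        List<String> result = new ArrayList<>();
        for (String w = reader.readNext(); w != null; w = reader.readNext()) {
            result.add(w);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(readAll("red green 42 blue")); // [red, green, , blue]
    }
}
```

One thing to keep in mind: a token such as 42, or a word with punctuation attached, comes back as "", so a caller that stops at the first empty string would end early. Skipping empty strings and stopping only on null may be closer to what you want.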
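You will have to change the while loop in the Test class for this to work properly, otherwise you risk a NullPointerException: just swap the two conditions in the loop definition (always check for null first; otherwise the first condition can throw the NPE and the null check is useless).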
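A sketch of the reordered loop, under a few assumptions: the emptiness test is done by content with isEmpty() rather than with != "" (on strings, != compares references, so it would not throw by itself but also would not reliably detect an empty string), a TreeSet stands in for your DataSet, and DistinctWordLoop and its readNext helper are hypothetical names.

```java
import java.util.Scanner;
import java.util.Set;
import java.util.TreeSet;

class DistinctWordLoop {
    // Stand-in for WordReader.readNext(): next token, or null when exhausted.
    static String readNext(Scanner scan) {
        return scan.hasNext() ? scan.next() : null;
    }

    static Set<String> distinctWords(String text) {
        Set<String> seen = new TreeSet<>(); // stand-in for DataSet: sorted, no duplicates
        try (Scanner scan = new Scanner(text)) {
            String nextWord = readNext(scan);
            // null is checked first, so isEmpty() is never called on null
            while (nextWord != null && !nextWord.isEmpty()) {
                seen.add(nextWord);
                nextWord = readNext(scan);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(distinctWords("b a b c a")); // [a, b, c]
    }
}
```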
To create the Scanner, you can pass it either a BufferedReader or the File directly.
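Both ways of constructing the Scanner can be sketched as follows. ScannerSources, countTokens and demo are hypothetical names, and the temporary file exists only to make the example runnable on its own; in your WordReader you would pass the File from the constructor instead.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Scanner;

class ScannerSources {
    // Counts the whitespace-delimited tokens the scanner yields.
    static int countTokens(Scanner scan) {
        int count = 0;
        while (scan.hasNext()) {
            scan.next();
            count++;
        }
        return count;
    }

    // Writes a small temporary file and reads it back both ways.
    static int[] demo() {
        try {
            Path tmp = Files.createTempFile("words", ".txt");
            Files.write(tmp, "alpha beta gamma".getBytes(StandardCharsets.UTF_8));
            File file = tmp.toFile();
            try (Scanner fromFile = new Scanner(file);
                 Scanner fromReader = new Scanner(new BufferedReader(new FileReader(file)))) {
                return new int[] { countTokens(fromFile), countTokens(fromReader) };
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println(counts[0] + " " + counts[1]); // 3 3
    }
}
```

Closing the Scanner also closes the underlying reader or file, so WordReader.close() could simply delegate to scan.close().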