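I have a program that takes incoming text wrapped in a Reader and returns the next token, which is either a word or a run of non-word characters (such as whitespace). It does not behave as expected.
To be as specific as possible, here is my test scaffolding, using JUnit 4 in Eclipse: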
@Test
public void testGetNextTokenWord() throws IOException {
    Reader in = new StringReader("Aren't you \ntired");
    TokenScanner d = new TokenScanner(in);
    try {
        assertTrue("has next", d.hasNext());
        assertEquals("Aren't", d.next());
        assertTrue("has next", d.hasNext());
        assertEquals(" ", d.next());
        assertTrue("has next", d.hasNext());
        assertEquals("you", d.next());
        assertTrue("has next", d.hasNext());
        assertEquals(" \n", d.next());
        assertTrue("has next", d.hasNext());
        assertEquals("tired", d.next());
        assertFalse("reached end of stream", d.hasNext());
    } finally {
        in.close();
    }
}
I'll post the full code to help you diagnose this, followed by the expected and observed behavior:
//Reads one character ahead so that hasNext() and next() can be answered
public TokenScanner(java.io.Reader in) throws IOException {
    //Reject a null reader
    if (in == null) {
        throw new IllegalArgumentException();
    }
    try {
        System.out.println("TokenScanner!");
        //Store the supplied reader
        this.tokenScanner = in;
        //Read the first character as look-ahead
        ch = tokenScanner.read();
    }
    //On a read error, treat the stream as exhausted
    catch (IOException e) {
        ch = -1;
    }
}

//Determines whether the given character is a valid word character.
public static boolean isWordCharacter(int c) {
    //Cast the int to a char
    char character = (char) c;
    //Letters and apostrophes count as word characters
    if (Character.isLetter(character) || character == '\'') {
        return true;
    }
    //Anything else is not a word character
    return false;
}

//Determines whether another token is available
public boolean hasNext() {
    //Invariant: ch is -1 exactly when the end of the stream has been reached
    return ch != -1;
}
And here is the function that is most likely the source of my headache:
//Determine next token
public String next() {
    //End of stream reached
    if (!hasNext()) {
        throw new NoSuchElementException();
    }
    //Initialize variable to hold token
    String word = "";
    try {
        //Character is a word character
        while (isWordCharacter(ch)) {
            word = word + (char) ch;
            ch = tokenScanner.read();
        }
        //Character is a space
        while (!Character.isWhitespace(ch)) {
            word = word + (char) ch;
            ch = tokenScanner.read();
        }
        System.out.println("Word is: " + word);
        return word;
    }
    //Exception catching
    catch (Exception e) {
        throw new NoSuchElementException();
    }
}
Given the test scaffolding above, the expected output is:
TokenScanner!
Word is: Aren't
Word is: you
Word is: /*Not sure how to represent newline in output*/
Word is: tired
The actual output is:
TokenScanner!
Word is: Aren't
Word is:
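The question is: why does this happen?
My output shows that the first failing test is:
assertEquals(" ", d.next());
The underlying problem is how I handle non-word tokens (whitespace). The last test also fails. Thanks for any help!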
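
A likely cause, judging from the code posted above: the second loop in next() runs while the character is NOT whitespace, so when ch holds a space neither loop executes, next() returns an empty string, and ch never advances. The following is a minimal sketch of one possible fix, assuming a token is meant to be a maximal run of characters that are either all word characters or all non-word characters. The class name TokenScannerSketch, the readChar helper, and the choice to wrap IOException in UncheckedIOException are assumptions made for illustration; they are not part of the original code.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical class name; a sketch of one possible fix, not the original TokenScanner.
class TokenScannerSketch implements Iterator<String> {
    private final Reader reader;
    private int ch; // look-ahead character, or -1 at end of stream

    TokenScannerSketch(Reader in) {
        if (in == null) {
            throw new IllegalArgumentException("reader must not be null");
        }
        this.reader = in;
        this.ch = readChar();
    }

    // Assumed helper: reads one character and rethrows I/O failures unchecked.
    private int readChar() {
        try {
            return reader.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Letters and apostrophes are word characters; -1 (end of stream) is not.
    static boolean isWordCharacter(int c) {
        return c != -1 && (Character.isLetter(c) || c == '\'');
    }

    @Override
    public boolean hasNext() {
        return ch != -1;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        // Decide the token kind from its first character, then consume
        // characters of the same kind until the kind changes or input ends.
        boolean wordToken = isWordCharacter(ch);
        StringBuilder token = new StringBuilder();
        while (ch != -1 && isWordCharacter(ch) == wordToken) {
            token.append((char) ch);
            ch = readChar();
        }
        return token.toString();
    }

    public static void main(String[] args) {
        TokenScannerSketch scanner =
                new TokenScannerSketch(new StringReader("Aren't you \ntired"));
        while (scanner.hasNext()) {
            // Escape newlines so whitespace tokens stay visible on one line.
            System.out.println("Token: [" + scanner.next().replace("\n", "\\n") + "]");
        }
    }
}
```

For the input used in the test, this sketch should produce five tokens: "Aren't", " ", "you", " \n" and "tired". The explicit ch != -1 check in the loop matters because a non-word run at the end of the input would otherwise never terminate, since read() keeps returning -1. Printing tokens inside brackets with "\n" escaped is one way to make whitespace tokens visible in console output.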