Parsing a string into tokens

Date: 2018-11-22 16:52:06

Tags: java token junit4

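I have a program that takes incoming text, wraps it in a Reader, and returns the next token, which is either a word or whitespace (a non-word). It does not behave as expected.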

To be as specific as possible, here is the test I am running with JUnit4 in Eclipse:

@Test
    public void testGetNextTokenWord() throws IOException {
        Reader in = new StringReader("Aren't you \ntired"); 
        TokenScanner d = new TokenScanner(in);
        try {
            assertTrue("has next", d.hasNext());
            assertEquals("Aren't", d.next());
            assertTrue("has next", d.hasNext());
            assertEquals(" ", d.next());
            assertTrue("has next", d.hasNext());
            assertEquals("you", d.next());
            assertTrue("has next", d.hasNext());
            assertEquals(" \n", d.next());
            assertTrue("has next", d.hasNext());
            assertEquals("tired", d.next());

            assertFalse("reached end of stream", d.hasNext());
        } finally {
            in.close();
        }
    }

I will post the full code to help you troubleshoot, followed by the expected and observed behavior:

//Reads as much to determine hasNext() and next()
    public TokenScanner(java.io.Reader in) throws IOException {

        //Throw exception if null
        if (in == null) {
            throw new IllegalArgumentException();
        }

        //Read in token
        try {   

            System.out.println("TokenScanner!");
            //Create new token scanner for argued reader
            this.tokenScanner = in;

            //Read next character
            ch = tokenScanner.read();
        }

        //Throw exception if error in reading
        catch (IOException e){
            ch = -1;
        }    
    }

//Determines whether the argued character is a valid word character.
    public static boolean isWordCharacter(int c) {

        //Cast int character to a char
        char character = (char)c;

        //Return true if character is valid word character
        if(Character.isLetter(character) || character == '\'') {
            return true;    
        }

        //Return false otherwise
        return false;
    }

//Determine whether another token is available
    public boolean hasNext() {

        //Leverage invariant
        return ch != -1 ;
    }

And here is the function that is probably the source of most of my headaches:

//Determine next token
    public String next() {

        //End of stream reached
        if(!hasNext()) {
            throw new NoSuchElementException();
        }

        //Initialize variable to hold token
        String word = "";

        try {

            //Character is a word character
            while(isWordCharacter(ch)) {
                word = word + (char)ch;
                ch = tokenScanner.read();

            }

            //Character is a space
            while(!Character.isWhitespace(ch)) {
                word = word + (char)ch;
                ch = tokenScanner.read();

            }           

            System.out.println("Word is: "+ word);
            return word;
        }

        //Exception catching
        catch(Exception e) {

            throw new NoSuchElementException();

        }   
    }

Given the test above, the expected output is:

TokenScanner!
Word is: Aren't
Word is: you
Word is: /*Not sure how to represent newline in output*/
Word is: tired
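On the question of how to show a newline in the debug output: one option is to escape whitespace before printing, so that each token stays visible on a single line. The sketch below is only an illustration; `TokenDebug` and `visible` are hypothetical names and not part of the original program.

```java
class TokenDebug {

    // Hypothetical helper: wraps the token in quotes and replaces common
    // whitespace characters with their escape sequences.
    static String visible(String token) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : token.toCharArray()) {
            switch (c) {
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // prints: Word is: " \n"
        System.out.println("Word is: " + visible(" \n"));
    }
}
```

With a helper like this, a space token would print as `" "` and an empty token as `""`, which makes it easier to tell the two apart in the console.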

The actual output is:

TokenScanner!
Word is: Aren't
Word is:

The question is: why does this happen?

My output shows that the first failing assertion is:

assertEquals(" ", d.next());

The underlying problem is how I represent non-words (whitespace). The last test also fails. Thanks for any help!
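A likely explanation, judging from the posted code: the second loop in `next()` runs while the character is NOT whitespace, which is the opposite of what its comment says. After "Aren't" is returned, `ch` holds the space. On the next call, neither loop consumes it (it is not a word character, and the second loop is skipped because it is whitespace), so `next()` returns an empty string and `ch` never advances. That matches the empty "Word is:" line and the failing `assertEquals(" ", d.next())`. At end of stream the same loop would probably never terminate, since `Character.isWhitespace(-1)` is false and `read()` keeps returning -1.

Below is a minimal sketch of one possible fix, not the definitive implementation. It assumes that a "non-word" token is any run of characters that are not word characters (so digits and punctuation would also land there). The class name `TokenScannerSketch` and the `tokenize` helper are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

class TokenScannerSketch {

    private final Reader reader;
    private int ch; // one-character lookahead; -1 means end of stream

    TokenScannerSketch(Reader in) throws IOException {
        if (in == null) {
            throw new IllegalArgumentException("reader must not be null");
        }
        this.reader = in;
        this.ch = in.read();
    }

    static boolean isWordCharacter(int c) {
        // -1 (end of stream) must never count as part of a token
        return c != -1 && (Character.isLetter(c) || c == '\'');
    }

    boolean hasNext() {
        return ch != -1;
    }

    String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        // The first character decides the kind of token; keep reading
        // while the following characters are of the same kind.
        boolean wordToken = isWordCharacter(ch);
        StringBuilder token = new StringBuilder();
        try {
            while (ch != -1 && isWordCharacter(ch) == wordToken) {
                token.append((char) ch);
                ch = reader.read();
            }
        } catch (IOException e) {
            // Assumption: surface read failures instead of hiding them
            throw new UncheckedIOException(e);
        }
        return token.toString();
    }

    // Hypothetical convenience method for trying the scanner on a string
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        try {
            TokenScannerSketch scanner = new TokenScannerSketch(new StringReader(text));
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }
}
```

For the test input "Aren't you \ntired", this sketch yields the five tokens the JUnit test asserts: "Aren't", " ", "you", " \n", "tired". Using a single loop keyed on the kind of the first character means every call consumes at least one character, so the scanner cannot get stuck returning empty tokens.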

0 answers:

No answers yet