Finding different tokens in a text file (Java)

Asked: 2015-02-26 20:32:06

Tags: java token identifier lexical

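I am working on an assignment where the program has to take in a file containing some code, identify the tokens in it, and output them in a specific format. So far I have read the characters from the file and added them to an array list. Now I am stuck on the logic for finding the individual tokens. I know I first need a loop that walks through the array list, and I have outlined my logic in the comments inside analyzeForTokens. What I can't figure out is how to make it go through the list and append each token only once: the first iteration of the for loop checks a character, and then the second iteration checks the next character of the same token again, so I think the results will overlap. How do I fix this?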

import java.util.ArrayList;
import java.util.Scanner;
import java.io.*;
import java.lang.Character;

public class Main {

    /*
        Constants for specific tokens that can be identified easily.
     */
    static final Character LPAREN = '(';
    static final Character RPAREN = ')';
    static final Character[] ADD_OP = {'+', '-'};
    static final String[] MULT_OP = {"*", "/", "//", "%"};
    static final Character[] ASSIGN = {':', '='};
    static final Character[] IDENTIFIERS = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'};
    static final int[] NUMBERS = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

    public static void main(String[] args) throws IOException {

        Scanner input = new Scanner(System.in); //Scanner for taking input from the user

        String fileName;
        System.out.println("Enter the name of the file.");
        fileName = input.next();

        fileExists(fileName); // Checks to see if the file exists

        ArrayList<Character> arrayOfTokens = new ArrayList<Character>();
        readToArray(arrayOfTokens, fileName);

        for(int i = 0; i < arrayOfTokens.size(); i++) {
            System.out.print(arrayOfTokens.get(i) + ", ");
        }

    }

    /*
        readToArray goes through a file and adds all its elements in individual character form. It is stored into an arraylist and it is then returned
        @param storeChar: This is an arraylist of characters that the characters will be saved into and then returned.
        @param fileName: The filename that you want to take the data from.
     */
    private static ArrayList<Character> readToArray(ArrayList<Character> storeChar, String fileName) throws IOException {
        /*
            Block of code to setup the fileInput stream to take in data from the file. Reads character by character and stores into an arraylist.
            int atChar: the current character the reader is at. Returns in int format (Need to be converted to character later on)
            int currentIndex: to add a character to an index. Increments until no more characters are left
         */
        FileInputStream fileInput = new FileInputStream(fileName);
        int atChar;
        int currentIndex = 0;

        /*
            Loop to go through and add the converted character from an int to the arraylist.
            Loops until atChar returns -1 which means no more characters in file.
         */
        while((atChar = fileInput.read()) != -1) {
            storeChar.add(currentIndex, (char)(atChar));
            currentIndex++;
        }
        fileInput.close();
        return storeChar;
    }

    /*
        fileExists method makes sure the file the user enters exists in the system. If it doesn't then the program will terminate before any further code is executed.
        @param fileName: The name of the file whose existence you want to check.
     */
    private static void fileExists(String fileName) {

        boolean ifExists; //Boolean statement that will later be set to the value of whether the file exists or not

        File file = new File(fileName);
        ifExists = file.exists();

        if(!ifExists) {
            System.out.println("Unable to find the file. Will now close the program.");
            System.exit(0);
        }
    }

    private static ArrayList<String> analyzeForTokens(ArrayList<Character> tokens, Character LPAREN, Character RPAREN, Character[] ADD_OP, String[] MULT_OP, Character[] ASSIGN, Character[] IDENTIFIERS, int[] NUMBERS) {

        ArrayList<String> identified = new ArrayList<String>();

        for(int i = 0; i < tokens.size(); i++) { // Outer loop: walk through the whole array list
            // Check whether the CURRENT character is an identifier, number, lparen, rparen, etc.
                // Inner loop: keep going until a whitespace character is found, then concatenate everything from the first index up to the whitespace into a new string
                // If it started with an identifier character, append "<identifierType>, identifierType" for the string (as long as the identifier is longer than one character)
                // If the string consists of a single character, it is an id: append "<id>, id"
                // If it is a number, convert the character to an integer and compare, and so on...
            // Once returned, just print out the values of the array list, since they should already be appended in the correct format

        }

        return identified;
    }
}
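
One possible way to avoid the overlap described above is to let the inner loop move the same index that the outer loop uses, so that every character is consumed exactly once. The following is only a minimal sketch under some assumptions that the assignment may not share: identifiers and keywords are runs of letters, numbers are runs of digits, and `:=` and `//` are the only two-character operators. The names `LexemeSplitter` and `splitLexemes` are hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: groups raw characters into lexemes without revisiting any character.
class LexemeSplitter {

    static List<String> splitLexemes(List<Character> chars) {
        List<String> lexemes = new ArrayList<>();
        int i = 0;
        while (i < chars.size()) {
            char c = chars.get(i);
            if (Character.isWhitespace(c)) {
                i++;                       // whitespace only separates lexemes
                continue;
            }
            int start = i;                 // first character of the current lexeme
            if (Character.isLetter(c)) {
                // Assumption: a word (keyword or identifier) is a run of letters.
                while (i < chars.size() && Character.isLetter(chars.get(i))) i++;
            } else if (Character.isDigit(c)) {
                // Assumption: a number is a run of digits.
                while (i < chars.size() && Character.isDigit(chars.get(i))) i++;
            } else if (c == ':' && i + 1 < chars.size() && chars.get(i + 1) == '=') {
                i += 2;                    // two-character assignment operator ":="
            } else if (c == '/' && i + 1 < chars.size() && chars.get(i + 1) == '/') {
                i += 2;                    // two-character operator "//"
            } else {
                i++;                       // any other single character, e.g. + - * ( )
            }
            // Build the lexeme from the characters in [start, i).
            StringBuilder sb = new StringBuilder();
            for (int k = start; k < i; k++) sb.append(chars.get(k));
            lexemes.add(sb.toString());
        }
        return lexemes;
    }
}
```

Because `i` only ever moves forward and each branch advances it past the whole lexeme, the next pass of the outer loop starts at the first character that has not been looked at yet.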

Edit: the input file and the expected output should look like this.

Input:

read a
read b
c := a + b + 3
write c 

Output:

<read>, read
<id>, a
<read>, read
<id>, b
<id>, c
<assign>, :=
<id>, a
<add_op>, +
<id>, b
<add_op>, +
<number>, 3
<write>, write
<id>, c 
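
Given that format, each lexeme (a word, number, or operator that has already been separated from its neighbours) could be mapped to one output line. The sketch below is an illustration, not the assignment's required solution. The labels `read`, `write`, `id`, `assign`, `add_op`, and `number` come from the expected output above; `mult_op`, `lparen`, `rparen`, and `error` are assumed names, and the class `TokenLabeler` with its methods is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns one lexeme into a line of the form "<type>, lexeme".
class TokenLabeler {

    static String typeOf(String lexeme) {
        switch (lexeme) {
            case "read":  return "read";     // keywords are checked before identifiers
            case "write": return "write";
            case ":=":    return "assign";
            case "+": case "-":
                return "add_op";
            case "*": case "/": case "//": case "%":
                return "mult_op";            // assumed label, not shown in the sample output
            case "(":     return "lparen";   // assumed label
            case ")":     return "rparen";   // assumed label
            default:      break;
        }
        if (lexeme.isEmpty()) return "error";                        // assumed label
        if (lexeme.chars().allMatch(Character::isDigit)) return "number";
        if (lexeme.chars().allMatch(Character::isLetter)) return "id";
        return "error";                                              // assumed label
    }

    static String format(String lexeme) {
        return "<" + typeOf(lexeme) + ">, " + lexeme;
    }

    public static void main(String[] args) {
        // Lexemes of the third sample input line, already split.
        List<String> lexemes = Arrays.asList("c", ":=", "a", "+", "b", "+", "3");
        for (String lexeme : lexemes) {
            System.out.println(format(lexeme));
        }
    }
}
```

Checking the keywords first means `read` and `write` are never reported as plain identifiers, and each lexeme produces exactly one line, so nothing is appended twice.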

0 answers:

No answers yet.