How to read a tab-indented, tree-structured txt file in Java

Asked: 2019-05-10 20:39:14

Tags: java tree export-to-csv tab-delimited

I am trying to read a .txt file whose tree structure is expressed with tab indentation, and I want to convert it to .csv.

Category
  Subcategory
     Subcategory1
        Subcategory11
            Item1
            Item2     
        Subcategory12
            Item1
        Subcategory13
            Item1
                Item11

I want to create a .csv file with this structure:

Category, Subcategory,Subcategory1, Subcategory11,Item1
Category, Subcategory,Subcategory1, Subcategory11,Item2 
Category, Subcategory,Subcategory1, Subcategory12,Item1
Category, Subcategory,Subcategory1, Subcategory13,Item1,Item11

What I have done so far is:

public static void main(String[] args) throws IOException {
    Scanner keywords = new Scanner(new File("keywords.txt"));

     ArrayList<ArrayList<String>> keywordsList = new ArrayList<ArrayList<String>>();
     ArrayList<String> newline = new ArrayList<String>();
        while(keywords.hasNext()){
            String line = keywords.nextLine();
            String[] tokens = line.split("\t");
            for(int i=0; i<tokens.length; i++){

                    if(tokens[i] != null && !tokens[i].isEmpty()){
                        newline.add(tokens[i]);
                    }
            }

            keywordsList.add(newline);

        }

}

3 answers:

Answer 0 (score: 1)

This should work (caveat: it may fail on unexpected input, e.g. a line that has 2 or more tabs more than the previous line):

    Scanner keywords = new Scanner(new File("keywords.txt"));

    ArrayList<String> stack = new ArrayList<String>();
    ArrayList<String> csvLines = new ArrayList<String>();

    // stores the number of leading tabs of the last line processed
    int lastSize = -1;

    while (keywords.hasNext()) {
        String line = keywords.nextLine();

        int tabs = 0;
        // Count tabs of current line
        while (line.length() > tabs // to avoid IndexOutOfBoundsException in charAt()
                && line.charAt(tabs) == '\t') {
            tabs++;
        }

        line = line.substring(tabs); // delete the starting tabs

        if (tabs <= lastSize) {
            // the current line is not indented deeper than the previous line, so the
            // previous line was a leaf: save the path currently on the stack as a CSV line
            String csvLine = "";
            for (String element : stack) {
                if (csvLine.length() > 0) {
                    csvLine += ", ";
                }
                csvLine += element;
            }
            csvLines.add(csvLine);
        }

        // if the current line is not deeper than the previous one, cut the stack back to the current depth
        for (int i = stack.size() - 1; i >= tabs; i--) {
            stack.remove(i);
        }

        // push the current line's name (after the cut above, stack.size() <= tabs always holds)
        if (tabs >= stack.size()) {
            stack.add(line);
        }

        // save the number of tabs of the current line
        lastSize = tabs;
    }
    keywords.close();

    // we have to save the last line processed
    if (lastSize >= 0) {
        // save line
        String csvLine = "";
        for (String element : stack) {
            if (csvLine.length() > 0) {
                csvLine += ", ";
            }
            csvLine += element;
        }
        csvLines.add(csvLine);
    }

    // print out CSV
    for (String string : csvLines) {
        System.out.println(string);
    }
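
As a follow-up to the caveat above, here is a minimal sketch of the same stack idea wrapped in a method that rejects lines indented more than one level deeper than the previous line, instead of silently producing a wrong row. The class name `TreeToCsv`, the method name `toCsvLines`, and the choice to throw `IllegalArgumentException` are assumptions for illustration, not part of the original code. Values are joined with ", " as above and are not CSV-escaped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TreeToCsv {

    /** Turns tab-indented lines into one row per leaf (a line with no deeper line right after it). */
    static List<String> toCsvLines(List<String> lines) {
        List<String> path = new ArrayList<>(); // names from the root down to the current line
        List<String> rows = new ArrayList<>();
        boolean pending = false;               // true once the path holds a line not yet emitted

        for (String raw : lines) {
            if (raw.trim().isEmpty()) {
                continue; // skip blank lines
            }
            // count leading tabs only
            int depth = 0;
            while (depth < raw.length() && raw.charAt(depth) == '\t') {
                depth++;
            }
            if (depth > path.size()) {
                throw new IllegalArgumentException(
                        "Indentation jumps more than one level: " + raw.trim());
            }
            // the current line is not deeper than the previous one, so the previous line was a leaf
            if (pending && depth < path.size()) {
                rows.add(String.join(", ", path));
            }
            path.subList(depth, path.size()).clear(); // cut the path back to the current depth
            path.add(raw.trim());
            pending = true;
        }
        if (pending) {
            rows.add(String.join(", ", path)); // the last line is always a leaf
        }
        return rows;
    }

    public static void main(String[] args) {
        // in-memory sample; in practice the lines could come from Files.readAllLines
        List<String> sample = Arrays.asList(
                "Category", "\tSubcategory", "\t\tItem1", "\t\tItem2", "\tOther");
        for (String row : toCsvLines(sample)) {
            System.out.println(row);
        }
    }
}
```

To produce an actual .csv file rather than console output, the returned list can be passed to `Files.write(Paths.get("keywords.csv"), rows)`, where the output file name is only an example.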

Answer 1 (score: 1)

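I built a very basic tree-node structure based on the spaces/indentation in front of the word on each line of the file. The code is below (hopefully the comments and variable names are self-explanatory). Note: I used Files.readAllLines to read the whole content into a list.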

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class Sample {

    public static void main(String[] args) throws IOException {
        File file = new File("C:\\Users\\Untitled.txt");
        List<String> lines = Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);

        Node root = new Node(lines.get(0));
        root.parent = null; 
        Node currentNode = root;
        for(int i=1; i<lines.size(); i++) {
            int cCount = lines.get(i).length()-lines.get(i).trim().length();
            int pCount = lines.get(i-1).length()-lines.get(i-1).trim().length();
            if(cCount > pCount) { //if spaces are more than previous add child node
                Node node = new Node(lines.get(i).trim());
                node.parent = currentNode;
                currentNode.childrens.add(node);
                currentNode = node;
            }
            else if(cCount == pCount) {//if spaces are same add node on same level
                Node node = new Node(lines.get(i).trim());
                currentNode.parent.childrens.add(node);
                node.parent=currentNode.parent;
                currentNode = node; //track the newest sibling so deeper lines attach to it
            }
            else if(cCount < pCount) {//if spaces are less then add node to parent of parent
                Node node = new Node(lines.get(i).trim());
                currentNode.parent.parent.childrens.add(node);
                node.parent= currentNode.parent.parent;
                currentNode = node;
            }
        }
        String result = root.name;
        createResultString(root, result);
    }

    private static void createResultString(Node root, String result) {
        for(int i=0; i<root.childrens.size(); i++) {
            Node node = root.childrens.get(i);
            String newResult = result+" , "+ node.name;
            if(!node.childrens.isEmpty()) { //recursive search for children node name
                createResultString(node, newResult);
            }else {
                System.out.println(newResult); //**This is your csv data**
            }
        }
    }

    //Sample TreeNode to hold structure
    static class Node{
        Node(String word){
            this.name = word;
        }
        String name;
        List<Node> childrens = new ArrayList<Sample.Node>();
        Node parent;        
    }
}

The output will be:

Category , Subcategory , Subcategory1 , Subcategory11 , Item1
Category , Subcategory , Subcategory1 , Subcategory11 , Item2
Category , Subcategory , Subcategory1 , Subcategory12 , Item1
Category , Subcategory , Subcategory1 , Subcategory13 , Item1 , Item11
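
One thing to watch with the indentation measure in the code above: `length() - trim().length()` also counts trailing whitespace, so a line with spaces after the name (such as `Item2` in the question's sample) looks deeper than it is. A minimal sketch of a helper that counts only the leading whitespace follows; the names `Indent` and `leadingWhitespace` are hypothetical and not part of the answer's code.

```java
final class Indent {

    /** Counts whitespace characters at the start of the line only (each tab or space counts as 1). */
    static int leadingWhitespace(String line) {
        int n = 0;
        while (n < line.length() && Character.isWhitespace(line.charAt(n))) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String line = "\t\tItem2   "; // two leading tabs, three trailing spaces
        System.out.println(line.length() - line.trim().length()); // prints 5: trailing spaces are counted too
        System.out.println(leadingWhitespace(line));              // prints 2: leading tabs only
    }
}
```

Swapping this in for the `cCount` and `pCount` computations should keep the comparison between consecutive lines unaffected by trailing spaces.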

Answer 2 (score: 0)

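I know this does not directly answer your question, but you are parsing a document, and Connect: Extracting Data with mapStateToProps is an ideal starting point for parsing documents.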