I am trying to read a .txt file that has a tree structure expressed with tab indentation, and I want to convert it to a .csv file.
Category
	Subcategory
		Subcategory1
			Subcategory11
				Item1
				Item2
			Subcategory12
				Item1
			Subcategory13
				Item1
					Item11
I want to create a .csv file with the following structure:
Category, Subcategory,Subcategory1, Subcategory11,Item1
Category, Subcategory,Subcategory1, Subcategory11,Item2
Category, Subcategory,Subcategory1, Subcategory12,Item1
Category, Subcategory,Subcategory1, Subcategory13,Item1,Item11
What I have done so far is:
public static void main(String[] args) throws IOException {
    Scanner keywords = new Scanner(new File("keywords.txt"));
    ArrayList<ArrayList<String>> keywordsList = new ArrayList<ArrayList<String>>();
    ArrayList<String> newline = new ArrayList<String>();
    while(keywords.hasNext()){
        String line = keywords.nextLine();
        String[] tokens = line.split("\t");
        for(int i=0; i<tokens.length; i++){
            if(tokens[i] != null && !tokens[i].isEmpty()){
                newline.add(tokens[i]);
            }
        }
        keywordsList.add(newline);
    }
}
Answer 0: (score: 1)
This should work (warning: it may fail on unexpected input, for example a line that has two or more tabs more than the previous line):
Scanner keywords = new Scanner(new File("keywords.txt"));
ArrayList<String> stack = new ArrayList<String>();
ArrayList<String> csvLines = new ArrayList<String>();
// stores the number of leading tabs of the last line processed
int lastSize = -1;
while (keywords.hasNext()) {
    String line = keywords.nextLine();
    int tabs = 0;
    // Count tabs of current line
    while (line.length() > tabs // to avoid IndexOutOfBoundsException in charAt()
            && line.charAt(tabs) == '\t') {
        tabs++;
    }
    line = line.substring(tabs); // delete the starting tabs
    if (tabs <= lastSize) {
        // if the current line has the same number of tabs as the previous line, or fewer,
        // then the previous line was a leaf and we can save it as CSV
        String csvLine = "";
        for (String element : stack) {
            if (csvLine.length() > 0) {
                csvLine += ", ";
            }
            csvLine += element;
        }
        csvLines.add(csvLine);
    }
    // if the current line is not deeper than the stack, cut the stack back to its depth
    for (int i = stack.size() - 1; i >= tabs; i--) {
        stack.remove(i);
    }
    // add the new element to the stack (after the cut it holds at most `tabs` elements)
    if (tabs >= stack.size()) {
        stack.add(line);
    }
    // save the number of tabs of the current line
    lastSize = tabs;
}
keywords.close();
// we have to save the last line processed
if (lastSize >= 0) {
    // save line
    String csvLine = "";
    for (String element : stack) {
        if (csvLine.length() > 0) {
            csvLine += ", ";
        }
        csvLine += element;
    }
    csvLines.add(csvLine);
}
// print out CSV
for (String string : csvLines) {
    System.out.println(string);
}
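The same stack idea can also be packaged as a method that takes the lines and returns the CSV rows, which makes it easier to test. This is only a sketch under some assumptions: the names `TreeToCsv` and `toCsvRows` are made up for illustration, depth is taken to be the number of leading tab characters, blank lines are skipped, and a line indented several levels deeper than its predecessor is treated as a direct child (the case the warning above refers to). Values are not quoted or escaped, so names containing commas would need extra handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TreeToCsv {

    /** Converts tab-indented lines into one CSV row per leaf (the root-to-leaf path). */
    static List<String> toCsvRows(List<String> lines) {
        List<String> path = new ArrayList<>();   // names from the root down to the current line
        List<String> rows = new ArrayList<>();
        int previousDepth = -1;                   // -1 means "no line seen yet"

        for (String raw : lines) {
            if (raw.trim().isEmpty()) {
                continue;                         // assumption: blank lines carry no meaning
            }
            int depth = 0;
            while (depth < raw.length() && raw.charAt(depth) == '\t') {
                depth++;
            }
            // Clamp so that a jump of several tabs still yields a direct child.
            depth = Math.min(depth, path.size());

            // The previous line was a leaf if the current one is not deeper.
            if (previousDepth >= 0 && depth <= previousDepth) {
                rows.add(String.join(",", path));
            }
            // Drop everything at or below the current depth, then push the new name.
            path.subList(depth, path.size()).clear();
            path.add(raw.substring(depth).trim());
            previousDepth = depth;
        }
        if (previousDepth >= 0) {
            rows.add(String.join(",", path));     // the last line is always a leaf
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "Category",
                "\tSubcategory",
                "\t\tSubcategory1",
                "\t\t\tSubcategory11",
                "\t\t\t\tItem1",
                "\t\t\t\tItem2",
                "\t\t\tSubcategory12",
                "\t\t\t\tItem1",
                "\t\t\tSubcategory13",
                "\t\t\t\tItem1",
                "\t\t\t\t\tItem11");
        toCsvRows(input).forEach(System.out::println);
    }
}
```

To produce an actual file instead of console output, the returned list could be passed to `Files.write(Paths.get("keywords.csv"), rows)`; the output file name here is just an example.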
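Answer 1: (score: 1)
I created a very basic tree-node structure based on the leading whitespace/indentation of the word on each line of the file. The code is below (hopefully the comments and variable names are self-explanatory). Side note: I used Files.readAllLines to read the entire content into a list.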
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class Sample {
    public static void main(String[] args) throws IOException {
        File file = new File("C:\\Users\\Untitled.txt");
        List<String> lines = Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);
        Node root = new Node(lines.get(0));
        root.parent = null;
        Node currentNode = root;
        for(int i=1; i<lines.size(); i++) {
            // indentation = line length minus trimmed length (current and previous line)
            int cCount = lines.get(i).length()-lines.get(i).trim().length();
            int pCount = lines.get(i-1).length()-lines.get(i-1).trim().length();
            if(cCount > pCount) { //if spaces are more than previous add child node
                Node node = new Node(lines.get(i).trim());
                node.parent = currentNode;
                currentNode.childrens.add(node);
                currentNode = node;
            }
            else if(cCount == pCount) {//if spaces are same add node on same level
                Node node = new Node(lines.get(i).trim());
                currentNode.parent.childrens.add(node);
                node.parent=currentNode.parent;
                currentNode = node; //so that a deeper next line attaches to this sibling
            }
            else if(cCount < pCount) {//if spaces are less then add node to parent of parent
                //note: assumes the indentation drops by exactly one level
                Node node = new Node(lines.get(i).trim());
                currentNode.parent.parent.childrens.add(node);
                node.parent= currentNode.parent.parent;
                currentNode = node;
            }
        }
        String result = root.name;
        createResultString(root, result);
    }

    private static void createResultString(Node root, String result) {
        for(int i=0; i<root.childrens.size(); i++) {
            Node node = root.childrens.get(i);
            String newResult = result+" , "+ node.name;
            if(!node.childrens.isEmpty()) { //recursive search for children node name
                createResultString(node, newResult);
            }else {
                System.out.println(newResult); //**This is your csv data**
            }
        }
    }

    //Sample TreeNode to hold structure
    static class Node{
        Node(String word){
            this.name = word;
        }
        String name;
        List<Node> childrens = new ArrayList<Sample.Node>();
        Node parent;
    }
}
The output will be:
Category , Subcategory , Subcategory1 , Subcategory11 , Item1
Category , Subcategory , Subcategory1 , Subcategory11 , Item2
Category , Subcategory , Subcategory1 , Subcategory12 , Item1
Category , Subcategory , Subcategory1 , Subcategory13 , Item1 , Item11
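The loop above compares each line only with the line directly before it and climbs exactly one level when the indentation decreases, so it handles the sample but not a file where the indentation drops by several levels at once. A minimal sketch of one way to relax that, assuming each node remembers its own indentation width so the parser can walk up until it finds a shallower ancestor. The names `IndentTree`, `indent`, `parse`, and `leafPaths` are hypothetical and not part of the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IndentTree {

    static final class Node {
        final String name;
        final int indent;                 // number of leading whitespace characters
        final Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String name, int indent, Node parent) {
            this.name = name;
            this.indent = indent;
            this.parent = parent;
        }
    }

    /** Builds the tree under an artificial root, so several top-level lines are allowed. */
    static Node parse(List<String> lines) {
        Node root = new Node("", -1, null);
        Node current = root;
        for (String line : lines) {
            int indent = 0;
            while (indent < line.length() && Character.isWhitespace(line.charAt(indent))) {
                indent++;
            }
            String name = line.substring(indent).trim();
            if (name.isEmpty()) {
                continue;                 // skip blank lines
            }
            // Walk up until the current node is strictly shallower than the new line.
            while (current.indent >= indent) {
                current = current.parent;
            }
            Node node = new Node(name, indent, current);
            current.children.add(node);
            current = node;
        }
        return root;
    }

    /** Returns one comma-separated root-to-leaf path per leaf, in file order. */
    static List<String> leafPaths(Node root) {
        List<String> rows = new ArrayList<>();
        for (Node child : root.children) {
            collect(child, child.name, rows);
        }
        return rows;
    }

    private static void collect(Node node, String prefix, List<String> rows) {
        if (node.children.isEmpty()) {
            rows.add(prefix);
            return;
        }
        for (Node child : node.children) {
            collect(child, prefix + "," + child.name, rows);
        }
    }

    public static void main(String[] args) {
        // "E" drops three levels at once and should still end up under "A".
        List<String> lines = Arrays.asList("A", "  B", "    C", "      D", "  E");
        leafPaths(parse(lines)).forEach(System.out::println);
    }
}
```

Because the root has an indentation of -1, the upward walk always stops before running off the top of the tree, which also removes the need for a special case on the first line.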
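Answer 2: (score: 0)
I know this does not directly answer your question, but since you are parsing a document, Connect: Extracting Data with mapStateToProps is an ideal starting point for parsing documents.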