I have an indented file that I need to parse using Java. I need some way to load it into the Section class shown below.
root
    root1
        text1
            text1.1
            text1.2
        text2
            text2.1
            text2.2
    root2
        text1
            text1.1
            text1.2
        text2
            text2.1
            text2.2.2
I have this class for holding the indented content:
import java.util.ArrayList;
import java.util.List;

public class Section
{
    private List<Section> children;
    private String text;
    private int depth;

    public Section(String t)
    {
        text = t;
    }

    // Lazily creates the list so callers never see null.
    public List<Section> getChildren()
    {
        if (children == null)
        {
            children = new ArrayList<Section>();
        }
        return children;
    }

    // Passing null clears the list; otherwise the new children are appended.
    public void setChildren(List<Section> newChildren)
    {
        if (newChildren == null) {
            children = newChildren;
        } else {
            if (children == null) {
                children = new ArrayList<Section>();
            }
            for (Section child : newChildren) {
                this.addChild(child);
            }
        }
    }

    public void addChild(Section child)
    {
        if (children == null) {
            children = new ArrayList<Section>();
        }
        if (child != null) {
            children.add(child);
        }
    }

    public String getText()
    {
        return text;
    }

    public void setText(String newText)
    {
        text = newText;
    }

    // Returns int to match the field (the original String return type did not compile).
    public int getDepth()
    {
        return depth;
    }

    public void setDepth(int newDepth)
    {
        depth = newDepth;
    }
}
I need some way to parse the file and produce the expected result, which is a Section object that looks like this:
Section=
    Text="Root"
    Children
        Child1: Text= "root1"
            Child1: "text1"
                Child1="Text 1.1"
                Child2="Text 1.2"
            Child2: "text2"
                Child1="Text 2.1"
                Child2="Text 2.2"
    Children
        Child2: Text= "root2"
            Child1: "text1"
                Child1="Text 1.1"
                Child2="Text 1.2"
            Child2: "text2"
                Child1="Text 2.1"
                Child2="Text 2.2"
Here is some code that I have started
int indentCount = 0;
String text;
// reader is assumed to be a BufferedReader over the input file
while ((text = reader.readLine()) != null)
{
    indentCount = countLeadingSpaces(text);
    //TODO create the section here
}

// Counts the whitespace characters before the first non-whitespace character.
public static int countLeadingSpaces(String word)
{
    int length = word.length();
    int count = 0;
    for (int i = 0; i < length; i++)
    {
        char first = word.charAt(i);
        if (Character.isWhitespace(first))
        {
            count++;
        }
        else
        {
            return count;
        }
    }
    return count;
}
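Answer 0 (score: 4)
I also added a parent pointer. The text could perhaps be parsed without it, but a parent pointer makes the job easier. First, you need a couple more constructors: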
static final int root_depth = 4; // assuming 4 whitespaces precede the tree root

private Section parent; // new field: the enclosing section, null for the root

public Section(String text, int depth) {
    this.text = text;
    this.depth = depth;
    this.children = new ArrayList<Section>();
    this.parent = null;
}

public Section(String text, int depth, Section parent) {
    this.text = text;
    this.depth = depth;
    this.children = new ArrayList<Section>();
    this.parent = parent;
}

public Section getParent() {
    return parent;
}
Then, when you start parsing the file, read it line by line:
Section root = null;
Section prev = null;
for (String line; (line = bufferedReader.readLine()) != null; ) {
    int t_depth = countLeadingSpaces(line); // no. of whitespaces at the beginning of this line
    String text_of_line = line.trim();
    if (prev == null && t_depth == root_depth) {
        root = new Section(text_of_line, root_depth);
        prev = root;
    }
    else if (prev == null) {
        throw new IllegalStateException("first line must be the root, indented by root_depth");
    }
    else if (t_depth > prev.getDepth()) {
        // deeper than the previous line, so it is a child of prev
        // assuming that empty sections are not allowed
        Section t_section = new Section(text_of_line, t_depth, prev);
        prev.addChild(t_section);
        prev = t_section; // the new section becomes the reference for the next line
    }
    else {
        // same depth or shallower: climb until prev sits at the same depth
        // (when the depths are already equal the loop does not run)
        while (t_depth < prev.getDepth()) {
            prev = prev.getParent();
        }
        // at this point, (t_depth == prev.getDepth()) = true, assuming consistent
        // indentation and that only the first line sits at root_depth
        Section t_section = new Section(text_of_line, t_depth, prev.getParent());
        prev.getParent().addChild(t_section);
        prev = t_section;
    }
}
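I have glossed over some details in the code above, but I think it gives you the overall idea of how to do this kind of parsing. Remember to implement the methods addChild(), getDepth(), getParent(), and so on.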
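For reference, here is a minimal self-contained sketch of the parent-pointer idea. It is not the answer's exact code: `PNode` and `ParentPointerParser` are hypothetical names standing in for an extended Section class, and it climbs to the nearest shallower ancestor rather than to a sibling, which also tolerates slightly irregular indentation. It assumes the first non-blank line is the single root.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the question's Section class, extended with a parent pointer.
class PNode {
    final String text;
    final int depth;      // number of leading whitespace characters
    final PNode parent;   // null for the root
    final List<PNode> children = new ArrayList<>();

    PNode(String text, int depth, PNode parent) {
        this.text = text;
        this.depth = depth;
        this.parent = parent;
    }
}

class ParentPointerParser {
    // Counts leading whitespace characters, like countLeadingSpaces in the question.
    static int indentOf(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    // Assumption: the first non-blank line is the root. Any later line that is
    // not indented deeper than the root is still attached under the root.
    static PNode parse(List<String> lines) {
        PNode root = null;
        PNode prev = null;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            int depth = indentOf(line);
            if (prev == null) {
                root = new PNode(line.trim(), depth, null);
                prev = root;
                continue;
            }
            // Climb until prev is the nearest ancestor shallower than this line.
            while (prev.parent != null && prev.depth >= depth) {
                prev = prev.parent;
            }
            PNode node = new PNode(line.trim(), depth, prev);
            prev.children.add(node);
            prev = node; // the new node is the reference point for the next line
        }
        return root;
    }
}
```

Calling `ParentPointerParser.parse(lines)` on the lines of the sample file returns the root node, from which the tree can be walked through `children`.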
Answer 1 (score: 3)
A surprisingly tricky problem... but here is some pseudocode:
initialize a stack
push first line to stack
while (there are more lines to read) {
    S1 = top of stack // do not pop off yet
    S2 = read a line
    if depth of S1 < depth of S2 {
        add S2 as child of S1
        push S2 into stack
    }
    else {
        while (depth of S1 >= depth of S2 AND there are at least 2 elements in stack) {
            pop stack
            S1 = top of stack // do not pop
        }
        add S2 as child of S1
        push S2 into stack
    }
}
return bottom element of stack
Here depth is the number of leading whitespace characters. You may have to modify or wrap the Section class so that it stores the depth of each line.
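The stack pseudocode translates to Java fairly directly. The following is only a sketch under some assumptions: `StackNode` and `StackTreeParser` are hypothetical names (a small wrapper that stores the depth next to the text, as suggested), the input list is non-empty and has no blank lines, and the first line is the root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical wrapper that keeps the depth next to the text.
class StackNode {
    final String text;
    final int depth;
    final List<StackNode> children = new ArrayList<>();

    StackNode(String text, int depth) {
        this.text = text;
        this.depth = depth;
    }
}

class StackTreeParser {
    // Depth is the number of leading whitespace characters.
    static int depthOf(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    // Assumptions: lines is non-empty, contains no blank lines,
    // and its first line is the root of the tree.
    static StackNode parse(List<String> lines) {
        Deque<StackNode> stack = new ArrayDeque<>();
        StackNode root = new StackNode(lines.get(0).trim(), depthOf(lines.get(0)));
        stack.push(root);
        for (String line : lines.subList(1, lines.size())) {
            StackNode s2 = new StackNode(line.trim(), depthOf(line));
            // Pop until the top of the stack is shallower than s2,
            // but always keep the root at the bottom.
            while (stack.peek().depth >= s2.depth && stack.size() >= 2) {
                stack.pop();
            }
            stack.peek().children.add(s2); // top of stack is the parent of s2
            stack.push(s2);
        }
        return root; // the bottom element of the stack
    }
}
```

The two branches of the pseudocode collapse into one here, because the popping loop simply does not run when the top of the stack is already shallower than the new line.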
Answer 2 (score: 3)
Based on this answer, I created a C# solution.
It allows multiple roots and assumes an input structure like this:
Test
	A
	B
	C
		C1
		C2
	D
Something
	One
	Two
	Three
Example usage of the code:
var lines = new[]
{
    "Test",
    "\tA",
    "\tB",
    "\tC",
    "\t\tC1",
    "\t\tC2",
    "\tD",
    "Something",
    "\tOne",
    "\tTwo",
    "\tThree"
};
var roots = IndentedTextToTreeParser.Parse(lines, 0, '\t');
var dump = IndentedTextToTreeParser.Dump(roots);
Console.WriteLine(dump);
You can specify the root indentation (zero by default) as well as the indent character (a tab, \t, by default).
Full code:
namespace MyNamespace
{
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Text;

    public static class IndentedTextToTreeParser
    {
        // https://stackoverflow.com/questions/21735468/parse-indented-text-tree-in-java
        public static List<IndentTreeNode> Parse(IEnumerable<string> lines, int rootDepth = 0, char indentChar = '\t')
        {
            var roots = new List<IndentTreeNode>();
            // --
            IndentTreeNode prev = null;
            foreach (var line in lines)
            {
                if (string.IsNullOrEmpty(line.Trim(indentChar)))
                    throw new Exception(@"Empty lines are not allowed.");
                var currentDepth = countWhiteSpacesAtBeginningOfLine(line, indentChar);
                if (currentDepth == rootDepth)
                {
                    // Trim so that the root text does not keep its indentation when rootDepth > 0.
                    var root = new IndentTreeNode(line.Trim(indentChar), rootDepth);
                    prev = root;
                    roots.Add(root);
                }
                else
                {
                    if (prev == null)
                        throw new Exception(@"Unexpected indention.");
                    if (currentDepth > prev.Depth + 1)
                        throw new Exception(@"Unexpected indention (children were skipped).");
                    if (currentDepth > prev.Depth)
                    {
                        // One level deeper: child of the previous node.
                        var node = new IndentTreeNode(line.Trim(), currentDepth, prev);
                        prev.AddChild(node);
                        prev = node;
                    }
                    else if (currentDepth == prev.Depth)
                    {
                        // Same level: sibling of the previous node.
                        var node = new IndentTreeNode(line.Trim(), currentDepth, prev.Parent);
                        prev.Parent.AddChild(node);
                        prev = node;
                    }
                    else
                    {
                        // Shallower: climb up to the node at the same depth, then add a sibling.
                        while (currentDepth < prev.Depth) prev = prev.Parent;
                        // at this point, (currentDepth == prev.Depth) = true
                        var node = new IndentTreeNode(line.Trim(indentChar), currentDepth, prev.Parent);
                        prev.Parent.AddChild(node);
                        // Without this, children of the new node would be attached to the old sibling.
                        prev = node;
                    }
                }
            }
            // --
            return roots;
        }

        public static string Dump(IEnumerable<IndentTreeNode> roots)
        {
            var sb = new StringBuilder();
            foreach (var root in roots)
            {
                doDump(root, sb, @"");
            }
            return sb.ToString();
        }

        // Counts the indent characters at the beginning of the line.
        private static int countWhiteSpacesAtBeginningOfLine(string line, char indentChar)
        {
            var lengthBefore = line.Length;
            var lengthAfter = line.TrimStart(indentChar).Length;
            return lengthBefore - lengthAfter;
        }

        private static void doDump(IndentTreeNode treeNode, StringBuilder sb, string indent)
        {
            sb.AppendLine(indent + treeNode.Text);
            foreach (var child in treeNode.Children)
            {
                doDump(child, sb, indent + @" ");
            }
        }
    }

    [DebuggerDisplay(@"{Depth}: {Text} ({Children.Count} children)")]
    public class IndentTreeNode
    {
        public IndentTreeNode(string text, int depth = 0, IndentTreeNode parent = null)
        {
            Text = text;
            Depth = depth;
            Parent = parent;
        }

        public string Text { get; }
        public int Depth { get; }
        public IndentTreeNode Parent { get; }
        public List<IndentTreeNode> Children { get; } = new List<IndentTreeNode>();

        public void AddChild(IndentTreeNode child)
        {
            if (child != null) Children.Add(child);
        }
    }
}
I also added a Dump() method that converts the tree back to a string, which makes it easier to debug the algorithm itself.
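Answer 3 (score: 1)
I implemented an alternative solution using recursive function calls. As far as I can tell, it performs worse than Max Seo's suggestion, especially on deep hierarchies. However, it is (in my opinion) easier to understand, and therefore easier to modify for your specific needs. Please take a look and let me know if you have any suggestions.
One benefit is that it handles trees with multiple roots as is.
Suppose we have a Node construct that can hold data and has zero or more children, which are also Nodes. Based on text input, we want to build a tree of nodes, where each node's data is the content of one line and the node's position in the tree is given by the line's position and indentation, so that an indented line is a child of the first preceding line that is less indented.
Calling the function with a list of lines produces a list of trees, built according to the lines' indentation. If the tree has only one root, the resulting tree is the first element of the result list.
Assuming we have a list of lines, define the function: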
List<Node> LinesToTree( List<Line> lines )
{
    if(lines.count >= 2)
    {
        firstLine = lines.shift
        nextLine = lines[0]
        children = new List<Line>
        // every following line that is more indented belongs to firstLine's subtree
        while(nextLine != null && firstLine.indent < nextLine.indent)
        {
            children.add(lines.shift)
            nextLine = lines[0] // null once lines is empty
        }
        firstLineNode = new Node
        firstLineNode.data = firstLine.data
        firstLineNode.children = LinesToTree(children)
        resultNodes = new List<Node>
        resultNodes.add(firstLineNode)
        if(lines.count > 0)
        {
            // the remaining lines are siblings of firstLine (and their subtrees)
            siblingNodes = LinesToTree(lines)
            resultNodes.addAll(siblingNodes)
            return resultNodes
        }
        else
        {
            return resultNodes
        }
    }
    elseif(lines.count == 1)
    {
        nodes = new List<Node>
        node = new Node
        node.data = lines[0].data
        node.children = new List<Node>
        nodes.add(node)
        return nodes
    }
    else
    {
        return new List<Node>
    }
}
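A PHP implementation follows. It can be customized with a delegate that returns a line's indentation and with the name of the children field in the output array.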
public static function IndentedLinesToTreeArray(array $lineArrays, callable $getIndent = null, $childrenFieldName = "children")
{
    //Default function to get element indentation
    if($getIndent == null){
        $getIndent = function($line){
            return $line["indent"];
        };
    }
    $lineCount = count($lineArrays);
    if($lineCount >= 2)
    {
        $firstLine = array_shift($lineArrays);
        $children = [];
        $nextLine = $lineArrays[0];
        while($getIndent($firstLine) < $getIndent($nextLine)){
            $children[] = array_shift($lineArrays);
            if(!isset($lineArrays[0])){
                break;
            }
            $nextLine = $lineArrays[0];
        }
        $firstLine[$childrenFieldName] = self::IndentedLinesToTreeArray($children, $getIndent, $childrenFieldName);
        if(count($lineArrays)){
            return array_merge([$firstLine],self::IndentedLinesToTreeArray($lineArrays, $getIndent, $childrenFieldName));
        }else{
            return [$firstLine];
        }
    }
    elseif($lineCount == 1)
    {
        $lineArrays[0][$childrenFieldName] = [];
        return $lineArrays;
    }
    else
    {
        return [];
    }
}
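Since the question asks for Java, the same idea can be sketched there as well. This is a variation rather than a line-by-line port: it recurses only into children and uses a plain loop for siblings, which avoids one recursive call per sibling. `RNode`, `RecursiveTreeParser`, and `linesToTree` are hypothetical names, and indentation is taken to be the count of leading whitespace characters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type: the data of one line plus its child nodes.
class RNode {
    final String data;
    final List<RNode> children;

    RNode(String data, List<RNode> children) {
        this.data = data;
        this.children = children;
    }
}

class RecursiveTreeParser {
    // Indentation is the number of leading whitespace characters.
    static int indentOf(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    // Returns one tree per top-level line, so multiple roots are supported.
    static List<RNode> linesToTree(List<String> lines) {
        List<RNode> result = new ArrayList<>();
        int i = 0;
        while (i < lines.size()) {
            String first = lines.get(i);
            int firstIndent = indentOf(first);
            // Every following line that is more indented belongs to this subtree.
            int j = i + 1;
            while (j < lines.size() && indentOf(lines.get(j)) > firstIndent) {
                j++;
            }
            // Recurse on the subtree lines only; siblings are handled by the loop.
            List<RNode> children = linesToTree(lines.subList(i + 1, j));
            result.add(new RNode(first.trim(), children));
            i = j;
        }
        return result;
    }
}
```

For an input with a single root, the tree is the first element of the returned list, as in the description above.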