Splitting a large text file to form a table

Time: 2014-01-07 13:45:13

Tags: c# regex loops logic

I have a text file. I need to read it and convert its data into a table. The file has this format:

{KeyValuePair}
{
    Key1 = Value1 {next}
    Key2 = Value2 {next}
    Key3 = Value3 {next}
    Key4 = {KeyValuePair}   {
                                KeyA = ValueA {next}
                                KeyB = ValueB {next}
                                KeyC = ValueC {next}
                            }
}

I need output like this:

[image: the desired table output]

Here is my code so far:

StreamReader reader = new StreamReader("C:\\Users\\Kaushik Kishore\\Documents\\TextToRead.txt");
string data = reader.ReadToEnd();
//string[] stringSeparater = new string[] { "{KeyValuePair}" };
//string[] getData = data.Split(stringSeparater, StringSplitOptions.None);
//string[] separater = new string[] { "{next}" };
//string[] nextSplit = data.Split(separater, StringSplitOptions.None);
string pattern = @"(=)|(next)|(KeyValuePair)|({)|(})";
string[] output = Regex.Split(data, pattern);
foreach (string one in output)
{
    Response.Write(one);
}

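So the problem I am facing is how to write the actual logic to extract the required strings. {next} indicates that we have to move to a new row in the table: every time the next keyword appears, I have to put the data on a new row. Thanks in advance.

Edit: I have made some effort and written some code. It now prints the data fine. What I want to know is how to pass the data from the controller to the view when the data is produced inside a loop.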

public ActionResult Index()
        {


            StreamReader reader = new StreamReader("C:\\Users\\Kaushik Kishore\\Documents\\Text2.txt");
            string data = reader.ReadToEnd();
            // replacing all tabs white space new line and everything
            string trimmedData = Regex.Replace(data, @"\s", "");
            string pattern = @"({next})|({KeyValuePair}{)|(}{next})";
            string[] output = Regex.Split(trimmedData, pattern);
            int length = output.Length;
            int count = 0;
            foreach (string one in output)
            {
                count++;
                if (one == "{KeyValuePair}{")
                {
                    Response.Write("Table Create</br>");
                }
                else if (count == length)
                {
                    string[] last = one.Split('=');
                    foreach (string lastVal in last)
                    {
                        Response.Write(lastVal.Substring(0,lastVal.Length-1));
                        Response.Write('|');
                    }
                }
                else
                {                                        
                    string[] keyVal = one.Split('=');
                    foreach (string val in keyVal)
                    {
                        if (val == "{next}")
                        {
                            Response.Write("</br>");                            
                        }
                        else if (val == "}{next}")
                        {
                            Response.Write("Subtable End</br>");                            
                        }
                        else if (val == "}")
                        {
                            Response.Write("");                            
                        }
                        else
                        {
                            Response.Write(val);
                            Response.Write("|");
                        }
                    }
                }
            }
            reader.Close();
            return View();
        }

3 Answers:

Answer 0: (score: 0)

If you use this small pattern, and apply it recursively to the value capture group, I think you can get what you want:

string pattern = @"(?>\s*(?<key>[^\s=]+)\s*=\s*|^\s*)(?>{KeyValuePair}\s*{\s*(?<value>(?>(?<c>{)|(?<-c>})|[^{}]+)+(?(c)(?!)))\s*}|(?<value>[^\s{]+)\s*(?<next>{next})\s*)";

Pattern details:

(?>                            # possible begin of the match
    \s*(?<key>[^\s=]+)\s*=\s*    # a keyname
  |                             # OR
    ^\s*                         # the start of the string
)

(?>
    # case KeyValuePair #
    {KeyValuePair} \s* { \s*  
    (?<value> 
        (?>(?<c>{)|(?<-c>})|[^{}]+)+ (?(c)(?!)) # content inside balanced curly brackets*
    )
    \s* }
  |                             # OR
    # case next #
    (?<value>
        [^\s{]+  # all that is not a white character or an opening curly bracket
    )
    \s*
    (?<next> {next} )\s* # the goal of this capture is to know in which case you are
)

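(*) You can find more explanations about balancing groups here: What are regular expression Balancing Groups?

The idea is to write a recursive method that calls itself when the pattern matches the "KeyValuePair" case. In the "next" case, the method only records the key/value pair in an array (or a similar structure). The method must return that array.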
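The recursive method described above could look roughly like the following. This is only a sketch: the class name `KeyValueReader`, the method name `Read`, and the choice of a `List<KeyValuePair<string, object>>` as the "array" are assumptions made for illustration, and the pattern is the one from this answer, unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the original answer).
public static class KeyValueReader
{
    // The pattern from this answer, copied as-is.
    private static readonly Regex Pattern = new Regex(
        @"(?>\s*(?<key>[^\s=]+)\s*=\s*|^\s*)(?>{KeyValuePair}\s*{\s*(?<value>(?>(?<c>{)|(?<-c>})|[^{}]+)+(?(c)(?!)))\s*}|(?<value>[^\s{]+)\s*(?<next>{next})\s*)");

    // Returns one entry per key. Each value is either a string ("next" case)
    // or a nested list of entries ("KeyValuePair" case).
    public static List<KeyValuePair<string, object>> Read(string text)
    {
        var rows = new List<KeyValuePair<string, object>>();
        foreach (Match m in Pattern.Matches(text))
        {
            string key = m.Groups["key"].Value;
            string value = m.Groups["value"].Value;

            if (m.Groups["next"].Success)
            {
                // "next" case: a plain key/value pair, just record it.
                rows.Add(new KeyValuePair<string, object>(key, value));
            }
            else if (m.Groups["key"].Success)
            {
                // "KeyValuePair" case with a key: recurse into the block content.
                rows.Add(new KeyValuePair<string, object>(key, Read(value)));
            }
            else
            {
                // Outermost block (no key): its content is the table itself.
                rows.AddRange(Read(value));
            }
        }
        return rows;
    }
}
```

Calling `KeyValueReader.Read(data)` on the file content should then give a list whose entries can be rendered one table row at a time, with a nested list rendered as a nested table.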

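Answer 1: (score: 0)

I created a parser-based solution that outputs a dictionary containing the key/value pairs. {KeyValuePair} blocks can be nested as deeply as needed.

Use it like this: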

string data = File.ReadAllText("data.txt");
var p = new Parser(data);
Dictionary<string, Value> dictionary = p.Parse();

A value can be either a string or a dictionary:

public abstract class Value { }

public class StringValue : Value
{
    public string Value { get; private set; }
    public StringValue(string value)
    {
        this.Value = value;
    }
}

public class DictionaryValue : Value
{
    public Dictionary<string, Value> Values { get; private set; }
    public DictionaryValue()
    {
        this.Values = new Dictionary<string, Value>();
    }
}

This allows error reporting:

public class ParseError : Exception
{
    public ParseError(string message)
        : base(message) { }
}

The parser consists of two parts. First, a tokenizer, which turns the input text into a stream of tokens:

KeyValuePair, OpenBracket,
KeyOrValue(Key1), Assign, KeyOrValue(Value1), Next,
KeyOrValue(Key2), Assign, KeyOrValue(Value2), Next,
KeyOrValue(Key3), Assign, KeyOrValue(Value3), Next,
KeyOrValue(Key4), Assign, KeyValuePair, OpenBracket,
KeyOrValue(KeyA), Assign, KeyOrValue(ValueA), Next,
KeyOrValue(KeyB), Assign, KeyOrValue(ValueB), Next,
KeyOrValue(KeyC), Assign, KeyOrValue(ValueC), Next,
CloseBracket, CloseBracket, End

Then the parser proper, which turns the token stream into a dictionary.

Here is the complete code:

using System;
using System.Collections.Generic;
using System.Text;

public class Parser
{
    private Tokenizer tk;
    public Parser(string text)
    {
        this.tk = new Tokenizer(text);
    }
    public Dictionary<string, Value> Parse()
    {
        Stack<Dictionary<string, Value>> dictionaries = new Stack<Dictionary<string, Value>>();

        Token t;

        while ((t = tk.ReadToken()) != Token.End)
        {
            switch (t)
            {
                case Token.KeyValuePair:
                    t = tk.ReadToken();
                    if (t != Token.OpenBracket)
                        throw new ParseError("{KeyValuePair} should be followed by a '{'");
                    dictionaries.Push(new Dictionary<string, Value>());
                    break;
                case Token.CloseBracket:
                    if (dictionaries.Count > 1)
                        dictionaries.Pop();
                    break;
                case Token.KeyOrValue:
                    string key = tk.TokenValue;
                    t = tk.ReadToken();
                    if (t != Token.Assign)
                        throw new ParseError("Key should be followed by a '='");
                    t = tk.ReadToken();
                    if (t == Token.KeyValuePair)
                    {
                        var value = new DictionaryValue();
                        dictionaries.Peek().Add(key, value);
                        dictionaries.Push(value.Values);
                    }
                    else if (t != Token.KeyOrValue)
                        throw new ParseError("Value expected after " + key + " =");
                    else
                    {
                        string value = tk.TokenValue;
                        dictionaries.Peek().Add(key, new StringValue(value));
                        t = tk.ReadToken();
                        if (t != Token.Next)
                            throw new ParseError("{next} expected after Key value pair (" + key + " = " + value + ")");
                    }
                    break;
                case Token.Error:
                    break;
                default:
                    break;
            }
        }
        return dictionaries.Peek();
    }

    private class Tokenizer
    {
        private string _data;
        private int currentIndex = 0;
        private string tokenValue;

        public string TokenValue
        {
            get { return tokenValue; }
        }

        public Tokenizer(string data)
        {
            this._data = data;
        }

        public Token ReadToken()
        {
            tokenValue = string.Empty;
            if (currentIndex >= _data.Length) return Token.End;

            char c = _data[currentIndex];
            if (char.IsWhiteSpace(c))
            {
                currentIndex++;
                return ReadToken();
            }
            else if (c == '{')
            {
                if (TryReadBracketedToken("KeyValuePair"))
                {
                    currentIndex++;
                    return Token.KeyValuePair;
                }
                else if (TryReadBracketedToken("next"))
                {
                    currentIndex++;
                    return Token.Next;
                }
                else
                {
                    currentIndex++;
                    return Token.OpenBracket;
                }
            }
            else if (c == '}')
            {
                currentIndex++;
                return Token.CloseBracket;
            }
            else if (c == '=')
            {
                currentIndex++;
                return Token.Assign;
            }
            else
            {
                StringBuilder valueBuilder = new StringBuilder();
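                // Re-read the character at the current index on every pass so the
                // loop never indexes past the end of the data.
                while (currentIndex < _data.Length && !char.IsWhiteSpace(_data[currentIndex]))
                {
                    valueBuilder.Append(_data[currentIndex]);
                    currentIndex++;
                }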
                tokenValue = valueBuilder.ToString();
                return Token.KeyOrValue;
            }
        }

        private bool TryReadBracketedToken(string token)
        {
            // ">=" rather than ">" so a bracketed token at the very end of the data is still recognised.
            bool result = _data.Length >= currentIndex + token.Length + 2
                        && _data.Substring(currentIndex + 1, token.Length + 1) == token + "}";
            if (result)
            {
                currentIndex++;
                currentIndex += token.Length;
            }
            return result;
        }
    }

    private enum Token
    {
        KeyValuePair,
        Next,
        OpenBracket,
        CloseBracket,
        Assign,
        KeyOrValue,
        End,
        Error
    }
}

public abstract class Value { }

public class StringValue : Value
{
    public string Value { get; private set; }
    public StringValue(string value)
    {
        this.Value = value;
    }
}

public class DictionaryValue : Value
{
    public Dictionary<string, Value> Values { get; private set; }
    public DictionaryValue()
    {
        this.Values = new Dictionary<string, Value>();
    }
}

public class ParseError : Exception
{
    public ParseError(string message)
        : base(message) { }
}
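To get from the parsed dictionary to the table the question asks for, one option is to walk the dictionary recursively and emit a nested table for each nested dictionary. The sketch below is an illustration only: `TableRenderer` and `Render` are hypothetical names, and to keep the block self-contained it uses a plain `IDictionary<string, object>` (string or nested dictionary) as a stand-in for the `Value` classes above.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical helper (not part of the original answer).
public static class TableRenderer
{
    // Renders a dictionary as an HTML table, one row per key.
    // A nested dictionary is rendered as a nested table in the value cell.
    public static string Render(IDictionary<string, object> values)
    {
        var html = new StringBuilder("<table border=\"1\">");
        foreach (KeyValuePair<string, object> pair in values)
        {
            html.Append("<tr><td>")
                .Append(WebUtility.HtmlEncode(pair.Key))
                .Append("</td><td>");

            var nested = pair.Value as IDictionary<string, object>;
            if (nested != null)
                html.Append(Render(nested));   // nested {KeyValuePair} block
            else
                html.Append(WebUtility.HtmlEncode(Convert.ToString(pair.Value)));

            html.Append("</td></tr>");
        }
        return html.Append("</table>").ToString();
    }
}
```

With the answer's own types, the same walk would test for `DictionaryValue` and recurse into its `Values`, and read `StringValue.Value` otherwise. The resulting string can be handed to the view, for example through `ViewBag`.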

Answer 2: (score: 0)

This would be your controller part:

public ActionResult Index()
    {
        ViewBag.DisplayTable = GetKeyValueDisplayContent(@"YourFilePath.Txt");
        return View();
    }

    private string GetKeyValueDisplayContent(string fileToRead)
    {
        // 01 Get Data
        string DataToProcess = GetDataToProcess(fileToRead);

        // 02 Cleaning Data (replacing all tabs white space new line and everything)
        DataToProcess = CleanDataToProcess(DataToProcess);

        // 03 Retrieve Array from Data format
        string[] output = GetDataInArray(DataToProcess);

        // 04 Displaying Result
        string DrawTable = GetDisplayHTML(output);
        return DrawTable;

    }

    private string GetDataToProcess(string fileToRead)
    {

        StreamReader reader = new StreamReader(fileToRead);
        string data = reader.ReadToEnd();
        reader.Close();

        return data;
    }

    private string CleanDataToProcess(string dataToProcess)
    {
        return Regex.Replace(dataToProcess, @"\s", "");
    }

    private string[] GetDataInArray(string dataToProcess)
    {
        string pattern = @"({next})|({KeyValuePair}{)|(}{next})";
        string[] output = Regex.Split(dataToProcess, pattern);

        return output;
    }

    private string GetDisplayHTML(string[] output)
    {
        int length = output.Length;
        int count = 0;
        StringBuilder OutputToPrint = new StringBuilder();


        foreach (string one in output)
        {

            if (one == "{KeyValuePair}{")
            {
                count++;
                if (count >= 2)
                {
                    OutputToPrint.Append("<td><table border = \"1\">");
                }
                else
                {
                    OutputToPrint.Append("<table border = \"1\">");
                }
            }
            else if (one.Contains("=") == true)
            {
                string[] keyVal = Regex.Split(one, @"=");
                OutputToPrint.Append("<tr>");
                foreach (string val in keyVal)
                {
                    if (val != "")
                    {
                        OutputToPrint.Append("<td>");
                        OutputToPrint.Append(WebUtility.HtmlEncode(val));
                        OutputToPrint.Append("</td>");

                    }
                }

            }
            else if (one.Equals("{next}"))
            {
                OutputToPrint.Append("</tr>");
            }
            else if (one.Contains("}{next}") == true)
            {
                OutputToPrint.Append("</table></td>");
            }
            else if (one == "}")
            {
                OutputToPrint.Append("</table>");
            }
            else { }
        }

        return OutputToPrint.ToString();
    }

And this would be the view:

<div>
@Html.Raw(ViewBag.DisplayTable)
</div>

Hope you find this useful.