Parsing text data line by line in C#

Time: 2011-04-14 09:15:49

Tags: c# regex parsing text

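I have a problem with my project. Let me explain it here; I need some expert guidance because I am new to programming. I have data like this in a Notepad text file: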

10192 20351 30473 40499 50449 60234    
10192 20207 30206 40203 50205 60226    
10192 20252 30312 40376 50334 60252

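The file has 26 rows of data; I am only showing 3 as an example. Here are my requirements, in order of priority:

- I just want to read the text file and extract the numbers, e.g. 10192, 20351 and so on.

- I have a ListView with 6 columns, and I want to show the numbers of each row in its columns:

Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6

10192    | 20351    | 30473    | 40499    | 50449    | 60234

- And of course, if possible: the first 2 digits of each 5-digit number are a unique code, and all I want is the last 3 digits, e.g. 192 351 473 499 234. So each number needs a correction in multiples of 10,000.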

I think I am confusing you all, sorry. Here is my current code:

    private delegate void UpdateUiTextDelegate(String Text);

    private void serial_DataRecieved(object sender, System.IO.Ports.SerialDataReceivedEventArgs e)
    {
        // Collect the received characters into our 'buffer' (string).
        string received_data;
        received_data = serial.ReadExisting();
        Dispatcher.Invoke(DispatcherPriority.Send, new UpdateUiTextDelegate(WriteData), received_data);
    }

    private void WriteData(String Text)
    {
        if (bufferData != "" || Text[0] == '1')
            bufferData += Text;
        if (bufferData.Length >= 35)
        {
            using (System.IO.StreamWriter file = new System.IO.StreamWriter(@"C:\Users\Rads\Desktop\Training06.txt", true))
            {
                file.WriteLine(bufferData);
            }
            listBox1.Items.Add(bufferData);
            bufferData = "";
        }
    }
    #endregion

    private void Window_Loaded(object sender, RoutedEventArgs e)
    {

    }

    //Browse .txt file
    private void Browse_btn_Click(object sender, RoutedEventArgs e)
    {
        Microsoft.Win32.OpenFileDialog dlg = new Microsoft.Win32.OpenFileDialog();

        dlg.DefaultExt = ".txt";
        dlg.Filter = "Text document (.txt)|*.txt";

        Nullable<bool> result = dlg.ShowDialog();

        if (result == true)
        {
            string filename = dlg.FileName;
            textBox.Text = filename;
        }
    }

    private void Parsing_String(string filename)
    {
        List<Row> list = new List<Row>();

        foreach (String str in File.ReadLines(filename))
        {
            String[] strCols = str.Split(Convert.ToChar(" "));
            list.Add(new Row()
            {
                Column1 = strCols[0].Substring(2),
                Column2 = strCols[1].Substring(2),
                Column3 = strCols[2].Substring(2),
                Column4 = strCols[3].Substring(2),
                Column5 = strCols[4].Substring(2),
                Column6 = strCols[5].Substring(2),


            });
        }

        dg.ItemsSource = list;
    }

    public class Row
    {
        public string Column1 { get; set; }
        public string Column2 { get; set; }
        public string Column3 { get; set; }
        public string Column4 { get; set; }
        public string Column5 { get; set; }
        public string Column6 { get; set; }

    }

XAML code

<Window x:Class="SamplingData.MainWindow"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    Title="MainWindow" Height="368" Width="401" Loaded="Window_Loaded">

<TabControl Height="332" HorizontalAlignment="Left" Name="tabControl1" VerticalAlignment="Top" Width="380">
    <TabItem Header="Sampling" Name="Sampling">
        <Grid>
            <Label Content="DATA RECEIVED" Height="28" HorizontalAlignment="Left" Margin="6,6,0,0" Name="label1" VerticalAlignment="Top" />
            <Button Content="Connect" Height="23" HorizontalAlignment="Left" Margin="264,6,0,0" Name="ConnectButton" VerticalAlignment="Top" Width="75" Click="Connect_Comms" />
            <ListBox Height="222" HorizontalAlignment="Left" Margin="10,37,0,0" Name="listBox1" VerticalAlignment="Top" Width="329" />
        </Grid>
    </TabItem>
    <TabItem Header="Training" Name="tabItem1">
        <Grid>
            <Button Content="Training" Height="23" HorizontalAlignment="Left" Margin="243,28,0,0" Name="Train_Btn" VerticalAlignment="Top" Width="75" />
            <Button Content="Browse" Height="23" HorizontalAlignment="Left" Margin="243,6,0,0" Name="Browse_btn" VerticalAlignment="Top" Width="75" Click="Browse_btn_Click" />
            <TextBox Height="23" HorizontalAlignment="Left" Margin="6,7,0,0" Name="textBox" VerticalAlignment="Top" Width="231" Background="{x:Null}"></TextBox>
            <RadioButton Content="RadioButton" Height="16" HorizontalAlignment="Left" Margin="507,76,0,0" Name="radioButton1" VerticalAlignment="Top" />
            <DataGrid x:Name="dg" AutoGenerateColumns="False" Margin="0,57,0,0" DataContext="{Binding}">
                <DataGrid.Columns>
                    <DataGridTextColumn Binding="{Binding Column1}" Header="Column 1"></DataGridTextColumn>
                    <DataGridTextColumn Binding="{Binding Column2}" Header="Column 2"></DataGridTextColumn>
                    <DataGridTextColumn Binding="{Binding Column3}" Header="Column 3"></DataGridTextColumn>
                    <DataGridTextColumn Binding="{Binding Column4}" Header="Column 4"></DataGridTextColumn>
                    <DataGridTextColumn Binding="{Binding Column5}" Header="Column 5"></DataGridTextColumn>
                    <DataGridTextColumn Binding="{Binding Column6}" Header="Column 6"></DataGridTextColumn>
                </DataGrid.Columns>
            </DataGrid>
        </Grid>
    </TabItem>
</TabControl>

Many thanks to anyone who can help me.

6 answers:

Answer 0 (score: 2)

I don't think you need a regular expression at all. You just need string.Split, first by line and then by space.

string[] lines = data.Split(new[] { Environment.NewLine }, StringSplitOptions.None);

For each line, you can get the fields by splitting the line on spaces.

string[] fields = line.Split(' ');
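
Putting the two splits together, a minimal sketch might look like the one below. It assumes the file content is already in a string (the `data` literal is just the sample from the question) and that every field has at least two characters. The names `SplitDemo` and `StripCodes` are made up for illustration.

```csharp
using System;

public class SplitDemo
{
    // Hypothetical helper: splits one line on spaces and drops the
    // two-digit column code from each field.
    public static string[] StripCodes(string line)
    {
        // RemoveEmptyEntries skips the gaps left by repeated or trailing spaces.
        string[] fields = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < fields.Length; i++)
        {
            // Keep everything from the third character onward.
            fields[i] = fields[i].Substring(2);
        }
        return fields;
    }

    public static void Main()
    {
        // Sample data from the question; in the real program this would come from the file.
        string data = "10192 20351 30473 40499 50449 60234\n" +
                      "10192 20207 30206 40203 50205 60226";

        // Split into lines, accepting both Windows and Unix line endings.
        string[] lines = data.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            Console.WriteLine(string.Join(" ", StripCodes(line)));
        }
    }
}
```

For the two sample lines this prints `192 351 473 499 449 234` and `192 207 206 203 205 226`.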

Answer 1 (score: 1)

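Edit: added the row variable, as suggested by Ash:

Since you did not specify which object you want to write the data to, I am assuming it is a DataRow named Row: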

private void Parsing_String(string filename)    
{
    DataTable dt = CreateDataTable();
    foreach (String str in File.ReadLines(filename))
    {
      String[] strCols = str.Split(Convert.ToChar(" "));
      DataRow Row = dt.NewRow(); //Where dt is a DataTable
      for (int i = 0; i < strCols.Length; i++)
      {
           Row[i] = strCols[i].Substring(2); //This will start reading from the third character
      }
      dt.Rows.Add(Row);
     }
      listView1.ItemsSource = dt.Rows;
}

//**EDIT**: Just in case you don't have a datatable and you want to create a small one:

public DataTable CreateDataTable()
    {
        DataTable dt = new DataTable();

        new string[] { "Column 1", "Column 2", "Column 3", "Column 4", "Column 5", "Column 6" }
            .ToList()
            .ForEach(c => { dt.Columns.Add(new DataColumn(c)); });
        return dt;
    }

Edit: last attempt (ignore the code above):

    private void Parsing_String(string filename)
    {
        List<Row> list = new List<Row>();

        foreach (String str in File.ReadLines(filename))
        {
            String[] strCols = str.Split(Convert.ToChar(" "));
            list.Add(new Row() 
            {
                Column1 = strCols[0].Substring(2),
                Column2 = strCols[1].Substring(2),
                Column3 = strCols[2].Substring(2),
                Column4 = strCols[3].Substring(2),
                Column5 = strCols[4].Substring(2),
                Column6 = strCols[5].Substring(2)
            });
        }

        dg.ItemsSource = list;
    }

    public class Row
    {
        public string Column1 { get; set; }
        public string Column2 { get; set; }
        public string Column3 { get; set; }
        public string Column4 { get; set; }
        public string Column5 { get; set; }
        public string Column6 { get; set; }
    }

Then in your XAML:

If you want to specify custom headers, you have to change it so that the columns are not auto-generated but use bindings, i.e.:

   <DataGrid x:Name="dg" AutoGenerateColumns="False">
        <DataGrid.Columns>
            <DataGridTextColumn Binding="{Binding Column1}" Header="Column 1"></DataGridTextColumn>
            <DataGridTextColumn Binding="{Binding Column2}" Header="Column 2"></DataGridTextColumn>
            <DataGridTextColumn Binding="{Binding Column3}" Header="Column 3"></DataGridTextColumn>
            <DataGridTextColumn Binding="{Binding Column4}" Header="Column 4"></DataGridTextColumn>
            <DataGridTextColumn Binding="{Binding Column5}" Header="Column 5"></DataGridTextColumn>
            <DataGridTextColumn Binding="{Binding Column6}" Header="Column 6"></DataGridTextColumn>
        </DataGrid.Columns>
    </DataGrid>

Edit: this computes the total of the first column

protected int CalculateFirstColumnTotal(List<Row> list)
{
    int total = 0;
    foreach (Row row in list)
        total += int.Parse(row.Column1);
    return total;
}

Edit

You never actually call the Parsing_String method. Add the following line to the browse method:

private void Browse_btn_Click(object sender, RoutedEventArgs e)    
{
    //Existing Code
    Parsing_String(textBox.Text);  //Add this line to the last line of the method.
}

Answer 2 (score: 1)

Perhaps something like this:

var result = from row in theFileAsString.Split('\n')
             select new {
                Columns = row.Split(' ').Select(s => s.Substring(2))
             };

You will end up with an IEnumerable of items, each with a Columns property that contains the data strings.

It is untested, but you get the idea.

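Answer 3 (score: 1)

Well, this may be overkill, but you mentioned that you want to put the data into a ListView (maybe a DataGrid?), in which case you probably want to convert the data into some kind of object form. It really depends on what you are going to do with the data once you have it.

Assuming that you will want to manipulate the data or do more with it once you have it, try something like this. You should be able to drop it straight into a new console application and run it: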

namespace ConsoleApplication6
{
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            string filename = @"c:\test.txt";

            // Because you're working with a small file, we'll just read all the lines into memory
            List<LineData> processedLines = new List<LineData>();
            foreach (var line in File.ReadAllLines(filename))
            {
                processedLines.Add(new LineData(line));
            }

            // Write out the line data to the console to prove that it has been read
            foreach (var processedLine in processedLines)
            {
                Console.WriteLine(
                    "{0},{1},{2},{3},{4},{5}", 
                    processedLine.Column1, 
                    processedLine.Column2,
                    processedLine.Column3,
                    processedLine.Column4,
                    processedLine.Column5,
                    processedLine.Column6);
            }
        }
    }

    public class LineData
    {
        public LineData(string line)
        {
            // Regex basically means find two digits ("Prefix") followed by 3 digits ("Value")
            Regex regex = new Regex(@"(?<Prefix>\d{2})(?<Value>\d{3})");
            var lineMatches = regex.Matches(line);
            if (lineMatches.Count != 6)
            {
                // You should really be throwing your own exception type...
                throw new Exception("Expected 6 columns!");
            }

            this.Column1 = this.ExtractMatchData(lineMatches[0]);
            this.Column2 = this.ExtractMatchData(lineMatches[1]);
            this.Column3 = this.ExtractMatchData(lineMatches[2]);
            this.Column4 = this.ExtractMatchData(lineMatches[3]);
            this.Column5 = this.ExtractMatchData(lineMatches[4]);
            this.Column6 = this.ExtractMatchData(lineMatches[5]);
        }

        private string ExtractMatchData(Match match)
        {
            return match.Groups["Value"].Value;
        }

        public string Column1 { get; set; }
        public string Column2 { get; set; }
        public string Column3 { get; set; }
        public string Column4 { get; set; }
        public string Column5 { get; set; }
        public string Column6 { get; set; }
    }
}

Answer 4 (score: 0)

using (var stream = File.Open(this.filename, FileMode.Open, FileAccess.Read))
using (var reader = new StreamReader(stream))
{
    var data = reader.ReadLine();
    // Stop at the end of the file (null) or at the first blank line
    while (!String.IsNullOrWhiteSpace(data))
    {
        string[] columns = data.Split(' ');
        Console.WriteLine(string.Format("{0} {1} {2} {3} {4} {5}",
            columns[0], columns[1], columns[2], columns[3], columns[4], columns[5]));
        data = reader.ReadLine();
    }
}

Answer 5 (score: 0)

You do not have to use a regular expression for this.

var data = @"10192 20351 30473 40499 50449 60234
10192 20207 30206 40203 50205 60226
10192 20252 30312 40376 50334 60252";

var result = from line in data.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries)
             let splitted = line.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries)
             select splitted.Select(s => s.Substring(2));

Or from a file:

using System.IO; // At the top of the file

var result = from line in File.ReadLines("path")
             let splitted = line.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries)
             select splitted.Select(s => s.Substring(2));

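result will now contain a sequence of sequences of strings (each with its first two characters removed). This version works with both Unix and Windows line endings. It also discards extra whitespace that might cause the other answers to fail.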
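
To illustrate the whitespace point, here is a small sketch using the first line of the question's sample, which ends in trailing spaces. It is only a demonstration under that assumption; the names `TrailingSpaceDemo` and `Parse` are hypothetical.

```csharp
using System;
using System.Linq;

public class TrailingSpaceDemo
{
    // Hypothetical helper: the fields of one line with the two-digit code removed.
    public static string[] Parse(string line)
    {
        return line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                   .Select(s => s.Substring(2))
                   .ToArray();
    }

    public static void Main()
    {
        // First sample line, including its trailing spaces.
        string line = "10192 20351 30473 40499 50449 60234    ";

        // A plain Split(' ') keeps an empty string for every extra space.
        string[] naive = line.Split(' ');
        Console.WriteLine(naive.Length); // more than 6 entries

        try
        {
            // Substring(2) throws on the empty entries.
            naive.Select(s => s.Substring(2)).ToArray();
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("plain Split failed on an empty field");
        }

        // With RemoveEmptyEntries only the six real fields remain.
        Console.WriteLine(string.Join(" ", Parse(line)));
    }
}
```

Code that only reads indexes 0 to 5 of the split result is unaffected by the trailing empties; the failure shows up when every entry is processed, as in the LINQ versions.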