How can I process this text block by block?

Asked: 2017-08-02 10:48:56

Tags: c# text-files block sequencefile

I want to process the data one block at a time.

Here is the text:

[Global]
asd
dsa
akl
ASd

[Test2]
bnmnb
hkhjk
tzutzi
Tzutzi
Tzitzi

[Test3]
5675
46546
464
564
56456
45645654
4565464

[other]
sdfsd
dsf
sdf
dsfs

First I want to take the first block and process it, then the second one, and so on.

private void textprocessing(string filename)
{
    using (StreamReader sr1 = new StreamReader(filename))
    {
        string linetemp = "";
        bool found = false;
        int index = 0;

        while ((linetemp=sr1.ReadLine())!=null)
        {
            if (found==true)
            {
                MessageBox.Show(linetemp);
                break;   
            }

            if (linetemp.Contains("["))
            {
                found = true;
            }
            else
            {
                found = false;
            }                                                             
        }                                    
    }          
}

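2 Answers:

Answer 0 (score: 1)

You can use string.Split() to split the text on "[" and then split each piece on line breaks. You then check each line for the presence of "]" to tell the header line apart from the data lines.
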
void Main()
{
    string txt = @"[Global]
asd
dsa
akl
ASd

[Test2]
bnmnb
hkhjk
tzutzi
Tzutzi
Tzitzi

[Test3]
5675
46546
464
564
56456
45645654
4565464

[other]
sdfsd
dsf
sdf
dsfs";

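    // Split into blocks at each '['; the empty piece before the first '[' is dropped.
    string[] split = txt.Split(new[] { '[' }, StringSplitOptions.RemoveEmptyEntries);
    foreach (var s in split)
    {
        // Split the block into lines, accepting both "\r\n" and "\n" line endings.
        var subsplits = s.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        foreach (var ss in subsplits)
        {
            // The header line (e.g. "Global]") still contains ']' and is skipped.
            if (!ss.Contains("]"))
                Console.WriteLine(ss);
        }
    }
}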

This outputs:

asd
dsa
akl
ASd


bnmnb
hkhjk
tzutzi
Tzutzi
Tzitzi


5675
46546
464
564
56456
45645654
4565464


sdfsd
dsf
sdf
dsfs

You can add an extra check that detects empty lines and ignores them.
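
The empty-line check could look roughly like the sketch below, which also keeps each block's name together with its lines so that the blocks can be processed one after another. The names BlockSplitter and SplitBlocks are made up for illustration and are not part of this answer; the sketch assumes that "[" only ever appears at the start of a header line and that no text precedes the first header.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BlockSplitter
{
    // Hypothetical helper: splits the text on '[' and returns, for each block,
    // its header name and its non-empty lines.
    public static List<KeyValuePair<string, List<string>>> SplitBlocks(string txt)
    {
        var blocks = new List<KeyValuePair<string, List<string>>>();
        foreach (var part in txt.Split(new[] { '[' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // Accept both Windows and Unix line endings and drop empty lines.
            var lines = part
                .Split(new[] { "\r\n", "\n" }, StringSplitOptions.None)
                .Where(l => l.Trim().Length > 0)
                .ToList();
            if (lines.Count == 0)
                continue;

            // The first line is the header, e.g. "Global]"; strip the closing bracket.
            var name = lines[0].Trim().TrimEnd(']');
            blocks.Add(new KeyValuePair<string, List<string>>(name, lines.Skip(1).ToList()));
        }
        return blocks;
    }

    public static void Main()
    {
        var txt = "[Global]\nasd\ndsa\n\n[Test2]\nbnmnb\nhkhjk\n";
        foreach (var block in SplitBlocks(txt))
            Console.WriteLine("{0}: {1}", block.Key, string.Join(", ", block.Value));
    }
}
```

Returning a list of name/lines pairs rather than printing directly means the caller decides what "processing a block" means.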

Answer 1 (score: 0)

Here is one way to do it:

// Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
private void ReadFile()
{
    // load all lines and remove the empty ones
    var lines = File.ReadAllLines(@"c:\temp\file.txt")
        .Where(l => l.Trim().Length > 0)
        .ToList();

    // mark the indexes where sections start (lines of the form "[name]");
    // working with indexes directly avoids IndexOf, which finds the wrong line when lines repeat
    var sectionIndexes = Enumerable.Range(0, lines.Count)
        .Where(i => lines[i].StartsWith("[") && lines[i].EndsWith("]"))
        .ToList();

    // make a list of tuples: start of section (Item1) and last line of section (Item2);
    // lines.Count is appended as the end marker so that the last section is not dropped
    var sections = sectionIndexes
        .Zip(sectionIndexes.Skip(1).Concat(new[] { lines.Count }), (a, b) => new Tuple<int, int>(a, b - 1))
        .ToList();

    // for each tuple (each section)
    foreach (var item in sections)
    {
        // pass the section name (the line in square brackets) and the lines that follow it
        ProcessSection(lines[item.Item1], lines.Skip(item.Item1 + 1).Take(item.Item2 - item.Item1));
    }
}

private void ProcessSection(string sectionName, IEnumerable<string> lines)
{
    Console.WriteLine("this is section {0} with following lines: {1}", sectionName, string.Join(", ", lines.ToArray()));
}

The output of the ProcessSection method will be:

this is section [Global] with following lines: asd, dsa, akl, ASd
this is section [Test2] with following lines: bnmnb, hkhjk, tzutzi, Tzutzi, Tzitzi
this is section [Test3] with following lines: 5675, 46546, 464, 564, 56456, 45645654, 4565464
this is section [other] with following lines: sdfsd, dsf, sdf, dsfs

This is a very quick and dirty solution, but it is good enough if the files you read are small.
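
For larger files, the blocks could instead be read one at a time without loading the whole file into memory. The following is only a sketch of that idea, not part of the solution above. It assumes that a header is a line that starts with "[" and ends with "]", that empty lines can be skipped, and that lines before the first header can be ignored; Block, BlockReader and ReadBlocks are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class Block
{
    public string Name;
    public List<string> Lines = new List<string>();
}

public static class BlockReader
{
    // Lazily yields one block at a time; only the current block is held in memory.
    public static IEnumerable<Block> ReadBlocks(TextReader reader)
    {
        Block current = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            line = line.Trim();
            if (line.Length == 0)
                continue; // ignore empty lines

            if (line.Length >= 2 && line.StartsWith("[") && line.EndsWith("]"))
            {
                // A new header closes the previous block, if there is one.
                if (current != null)
                    yield return current;
                current = new Block { Name = line.Substring(1, line.Length - 2) };
            }
            else if (current != null)
            {
                current.Lines.Add(line);
            }
            // Lines before the first header are ignored (an assumption of this sketch).
        }

        // The last block has no following header, so it is emitted here.
        if (current != null)
            yield return current;
    }

    public static void Main()
    {
        // A StringReader stands in for the file; for a real file, pass
        // new StreamReader(filename) inside a using block instead.
        using (var reader = new StringReader("[Global]\nasd\ndsa\n\n[Test2]\nbnmnb\n"))
        {
            foreach (var block in ReadBlocks(reader))
                Console.WriteLine("{0}: {1}", block.Name, string.Join(", ", block.Lines));
        }
    }
}
```

Because ReadBlocks is an iterator, each block is handed to the foreach loop as soon as the next header (or the end of the input) is reached, which matches the "first block, then the second, and so on" order asked for in the question.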

If you have any further questions, feel free to ask.