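Parsing a large text file and writing the output to another text file

Asked: 2013-12-17 16:56:56

Tags: c# linq parsing text-files

I want to parse a large text file and, whenever a line contains a certain substring, append that line to a new text file. I need the solution with the lowest memory usage. This is what I have so far; the comments describe the parts I need help adding: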

.
.
.
if (File.ReadLines(filepath).Any(line => line.Contains(myXML.searchSTRING)))
{

// code to grab that line and append it to a new text file;
// if the new text file doesn't exist, create it.
// All the text files I'm parsing have the same header. I want to grab
// the third line and use it as the header of my new text file.
// Only write the header once; I do not want it written every time a new
// text file is opened for parsing.

}

5 answers:

Answer 0 (score: 7)

Try:

var count = 1;
File.WriteAllLines(newFilePath,
  File.ReadLines(filepath)
  // Keep the third line (the header) plus every line containing the search string.
  .Where(l => count++ == 3 || l.Contains(myXML.searchSTRING))
);

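Both WriteAllLines() and ReadLines() work with enumerators, so lines are streamed rather than loaded all at once and memory usage stays relatively low.

I don't know how you would decide to write the header only once; that depends on how you obtain the list of files to process. Are they in an array? If so, wrap the File.WriteAllLines call in a foreach loop over that array. (Note that WriteAllLines overwrites its target file, so inside a loop File.AppendAllLines is the call that accumulates output.)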
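As a rough sketch of that loop, the following assumes the inputs are available as a string array, that the third line of the first file is the shared header, and that data starts on line 4. The class and method names (FilteredMerge, Run) are hypothetical and not from the question. It writes the header only if the output file does not exist yet, then appends matching lines from each input:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FilteredMerge
{
    // Hypothetical helper: appends the header once, then the matching
    // lines of every input file, to outputPath.
    public static void Run(string[] inputPaths, string outputPath, string search)
    {
        // Assumption: an existing output file already has its header.
        bool headerNeeded = !File.Exists(outputPath);

        foreach (var path in inputPaths)
        {
            if (headerNeeded)
            {
                // Line 3 of the first input is taken as the shared header.
                var header = File.ReadLines(path).Skip(2).Take(1);
                File.AppendAllLines(outputPath, header);
                headerNeeded = false;
            }

            // Assumption: data starts on line 4, so skip the first three lines.
            // ReadLines is lazy, so only one line is held in memory at a time.
            var matches = File.ReadLines(path)
                .Skip(3)
                .Where(line => line.Contains(search));
            File.AppendAllLines(outputPath, matches);
        }
    }
}
```

File.AppendAllLines creates the file when it is missing, which covers the "create it if it doesn't exist" requirement without a separate check.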

Answer 1 (score: 1)

Something like this should do it (edited to reflect @JimMischel's comment):

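// Requires: using System.IO;
private static void WriteFile(string mySearchString, string fileToWrite, params string[] filesToRead)
{
    // Open the output in append mode; the file is created if it does not exist.
    using (var sw = new StreamWriter(fileToWrite, true))
    {
        // Tracks whether the header has already been written during this call.
        var headerWritten = false;

        foreach (var file in filesToRead)
        {
            using (var sr = new StreamReader(file))
            {
                string line;
                var lineNumber = 0; // restarts for every input file

                while ((line = sr.ReadLine()) != null)
                {
                    lineNumber++;

                    if (lineNumber == 3 && !headerWritten)
                    {
                        // The third line of the first file becomes the header.
                        sw.WriteLine(line);
                        headerWritten = true;
                    }
                    else if (lineNumber > 3 && line.Contains(mySearchString))
                    {
                        sw.WriteLine(line);
                    }
                }
            }
        }
    }
}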

You would call it like this:

WriteFile("Foobar", "fileToWrite.txt", "input1.txt", "input2.txt", "input3.txt");

Answer 2 (score: 0)

You can use a StreamWriter:

using (var fs = new FileStream(outputFilePath, FileMode.Append, FileAccess.Write))
{
    using (var sw = new StreamWriter(fs))
    {
        foreach (var line in File.ReadLines(filepath).Where(line => line.Contains(myXML.searchSTRING)))
        {
            sw.WriteLine(line);
        }
    }
}

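Answer 3 (score: 0)

I think the most important point is to use "Where" instead of "Any". Any returns true/false depending on whether any element of the collection matches, whereas you want to filter the collection. Something like the following should work in combination with the answers above (though I would use LINQ).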

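string filepath = "infile.txt";
var searchString = "temp";
using (var outFile = new StreamWriter("output.txt"))
{
    // The third line of the input file is the header.
    var header = File.ReadLines(filepath).Skip(2).First();
    outFile.WriteLine(header);
    // Where only filters lazily; the foreach enumerates and writes each match.
    foreach (var line in File.ReadLines(filepath).Where(x => x.Contains(searchString)))
    {
        outFile.WriteLine(line);
    }
}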
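The difference between Any and Where described in this answer can be illustrated with a small sketch over an in-memory sequence. The class and method names here are hypothetical, and the sample lines in the usage note are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusWhere
{
    // Any answers a yes/no question and stops at the first matching line.
    public static bool HasMatch(IEnumerable<string> lines, string search)
    {
        return lines.Any(l => l.Contains(search));
    }

    // Where yields the matching lines themselves, lazily, one at a time.
    public static IEnumerable<string> Matching(IEnumerable<string> lines, string search)
    {
        return lines.Where(l => l.Contains(search));
    }
}
```

For the lines "alpha temp", "beta", "gamma temp" and the search string "temp", HasMatch only reports that a match exists, while Matching returns the two matching lines. That is why the original `if (File.ReadLines(...).Any(...))` cannot hand back the lines to write.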

Answer 4 (score: -1)

Please read this article on MemoryMappedFile:

http://www.dotnetperls.com/memorymappedfile-benchmark