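Reading and writing very large text files in C#

Time: 2016-06-09 11:44:20

Tags: c# .net wpf streamreader streamwriter

I have a very large file, almost 2 GB in size. I am trying to write a process that reads the file in and writes it out without the first line. I am pretty much limited to reading and writing one line at a time, which takes forever. I can open it, remove the first line, and save it in TextPad faster, though even that is slow.

I use this code to get the number of records in the file: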

private long getNumRows(string strFileName)
{
    long lngNumRows = 0;
    string strMsg;

    try
    {
        lngNumRows = 0;
        using (var strReader = File.OpenText(@strFileName))
        {
            while (strReader.ReadLine() != null)
            {
                lngNumRows++;
            }

            strReader.Close();
            strReader.Dispose();
        }
    }
    catch (Exception excExcept)
    {
        strMsg = "The File could not be read: ";
        strMsg += excExcept.Message;
        System.Windows.MessageBox.Show(strMsg);
        //Console.WriteLine("There was an error reading the file: ");
        //Console.WriteLine(excExcept.Message);

        //Console.ReadLine();
    }

    return lngNumRows;
}

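This only takes a few seconds to run. When I add the following code, it takes forever to run. Am I doing something wrong? Why does the writing add so much time? Any ideas on how to make it faster?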

private void ProcessTextFiles(string strFileName)
{
    string strDataLine;
    string strFullOutputFileName;
    string strSubFileName;
    int intPos;
    long lngTotalRows = 0;
    long lngCurrNumRows = 0;
    long lngModNumber = 0;
    double dblProgress = 0;
    double dblProgressPct = 0;
    string strPrgFileName = "";
    string strOutName = "";
    string strMsg;
    long lngFileNumRows;

    try
    {
       using (StreamReader srStreamRdr = new StreamReader(strFileName))
        {
            while ((strDataLine = srStreamRdr.ReadLine()) != null)
            {
                lngCurrNumRows++;

                if (lngCurrNumRows > 1)
                {
                    WriteDataRow(strDataLine, strFullOutputFileName);
                }
            }

            srStreamRdr.Dispose();
        }
    }
    catch (Exception excExcept)
    {
        strMsg = "The File could not be read: ";
        strMsg += excExcept.Message;
        System.Windows.MessageBox.Show(strMsg);
        //Console.WriteLine("The File could not be read:");
        //Console.WriteLine(excExcept.Message);
    }
}

public void WriteDataRow(string strDataRow, string strFullFileName)
{
    //using (StreamWriter file = new StreamWriter(@strFullFileName, true, Encoding.GetEncoding("iso-8859-1")))
    using (StreamWriter file = new StreamWriter(@strFullFileName, true, System.Text.Encoding.UTF8))
    {
        file.WriteLine(strDataRow);
        file.Close();
    }
}

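2 Answers:

Answer 0 (score: 8)

I'm not sure how much this will improve performance, but opening and closing the output file for every line you write is certainly not a good idea.

Just open both files once, then write directly: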

// Open the output and the input once, outside the loop.
using (StreamWriter file = new StreamWriter(@strFullFileName, true, System.Text.Encoding.UTF8))
using (StreamReader srStreamRdr = new StreamReader(strFileName))
{
    while ((strDataLine = srStreamRdr.ReadLine()) != null)
    {
        lngCurrNumRows++;

        // Skip the first line, write every other line.
        if (lngCurrNumRows > 1)
            file.WriteLine(strDataLine);
    }
}

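You can also drop the check on lngCurrNumRows by doing a single throwaway read before entering the while loop:

// Read and discard the first line; null means the file is empty.
strDataLine = srStreamRdr.ReadLine();
if (strDataLine != null)
{
    while ((strDataLine = srStreamRdr.ReadLine()) != null)
    {
        file.WriteLine(strDataLine);
    }
}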
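
Putting the two snippets above together, a self-contained version might look like the sketch below. The class name `HeaderStripper`, the method name `CopyWithoutFirstLine`, the 64 KB buffer size, and the command-line handling are all assumptions made for illustration; they are not part of the question's code. It also assumes the input is UTF-8 (or carries a byte order mark) and passes `false` for `append`, so an existing output file is overwritten rather than extended.

```csharp
using System;
using System.IO;
using System.Text;

static class HeaderStripper
{
    // Copies inputPath to outputPath without its first line.
    // Returns the number of lines written.
    public static long CopyWithoutFirstLine(string inputPath, string outputPath)
    {
        // 64 KB buffers: an arbitrary choice for this sketch, not a tuned value.
        const int BufferSize = 1 << 16;
        long written = 0;

        // Both files are opened exactly once and stay open for the whole copy.
        using (var reader = new StreamReader(inputPath, Encoding.UTF8, true, BufferSize))
        using (var writer = new StreamWriter(outputPath, false, Encoding.UTF8, BufferSize))
        {
            // Read and discard the first line; null means the input is empty.
            if (reader.ReadLine() == null)
                return 0;

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
                written++;
            }
        }

        return written;
    }

    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: HeaderStripper <input> <output>");
            return;
        }

        long count = CopyWithoutFirstLine(args[0], args[1]);
        Console.WriteLine("Lines written: " + count);
    }
}
```

Note that `WriteLine` ends every line with `Environment.NewLine`, so the output's line endings may differ from the input's, and the last line always gets a terminator.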

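Answer 1 (score: 0)

It depends on your machine's memory. You could try the following (my large file is "D:\savegrp.log"; I had a 2 GB file lying around). It used about 6 GB of memory when I tried it.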

It does depend on the available memory.

int counter = File.ReadAllLines(@"D:\savegrp.log").Length;
Console.WriteLine(counter);
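
If memory is the limiting factor, a lazier variant of the same count is possible. The sketch below assumes .NET Framework 4.0 or later, where `File.ReadLines` is available. Unlike `File.ReadAllLines`, it enumerates the file line by line instead of building an array of every line first. The class and method names are hypothetical, and the path is taken from the command line rather than hard-coded.

```csharp
using System;
using System.IO;
using System.Linq;

static class LineCounter
{
    // Counts lines without loading the whole file into memory.
    public static long CountLines(string path)
    {
        // File.ReadLines returns a lazy IEnumerable<string>;
        // LongCount walks it once and returns a 64-bit count.
        return File.ReadLines(path).LongCount();
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: LineCounter <file>");
            return;
        }

        Console.WriteLine(CountLines(args[0]));
    }
}
```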