Out-of-memory error in a loop of regex commands?

Time: 2016-08-05 23:02:57

Tags: c# regex

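After running the regex I set the strings to null and force a garbage collection, but I still get an out-of-memory error.

Calling the garbage collector manually does not help. I assign null to the strings to clear them.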

int i = 0;

while (i < 40 )
{
    string strFile42 = File.ReadAllText(@"C:\Users\diego\Desktop\finalregex.txt"); 
    string Value3 = @"\n(.*?)&" ;
    string Value4 =string.Format("\n$1#{0}#", i);
    //Force garbage collection.
    GC.Collect();
    strFile42 = Regex.Replace(strFile42, Value3, Value4);
    //Force garbage collection.
    GC.Collect();

    Value4 = null;
    Value3 = null;
    GC.Collect();
    GC.WaitForPendingFinalizers();

    File.WriteAllText(@"C:\Users\diego\Desktop\finalregex.txt", strFile42);
    strFile42 =null;

    GC.Collect();
    GC.WaitForPendingFinalizers();
    i = i + 1;
}

==========================================================

This is my solution

    string strFile42 = File.ReadAllText(@"C:\Users\diego\Desktop\finalregex.txt");

    // regex code for replace (PUT REGEX CODE HERE)

    File.WriteAllText(@"C:\Users\diego\Desktop\finalregex.txt", strFile42);

    strFile42 = null;

    // Temporarily shrink the static Regex cache to zero so its entries are discarded.
    int oldCacheSize = Regex.CacheSize;
    Regex.CacheSize = 0;
    GC.Collect();

    // Restore the previous cache size.
    Regex.CacheSize = oldCacheSize;
    GC.Collect();
    GC.WaitForPendingFinalizers();

3 answers:

Answer 0 (score: 4)

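Calling string strFile42 = File.ReadAllText(filename); allocates a huge string, and running Regex.Replace over all of it in one go adds further memory overhead.

The string is allocated on the Large Object Heap (LOH), which the GC does not collect together with ordinary small objects. When you call GC.Collect() to invoke the garbage collector, there is NO guarantee that the string will be collected immediately.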
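To see the large-object behaviour described above, here is a minimal sketch. It assumes the commonly documented 85,000-byte LOH threshold and relies on GC.GetGeneration reporting generation 2 for LOH objects; the class and method names (LohDemo, GenerationOfNewString) are made up for this illustration.

```csharp
using System;

public static class LohDemo
{
    // Returns the GC generation reported for a freshly allocated string
    // of the given length (hypothetical helper for this illustration).
    public static int GenerationOfNewString(int length)
    {
        string s = new string('x', length);
        return GC.GetGeneration(s);
    }

    public static void Main()
    {
        // 10 chars is about 20 bytes of payload: an ordinary small-object allocation.
        Console.WriteLine("small: gen {0}", GenerationOfNewString(10));

        // 100,000 chars is about 200,000 bytes, above the 85,000-byte threshold,
        // so the string is placed on the LOH, which is collected with generation 2.
        Console.WriteLine("large: gen {0}", GenerationOfNewString(100000));
    }
}
```

A 72 MB text file read with File.ReadAllText becomes a string of roughly twice that size in memory (UTF-16), and Regex.Replace produces another string of similar size, so each pass needs several large contiguous blocks.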

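So it is better to use a BufferedStream with a StreamReader and process the file one line at a time, which keeps memory overhead negligible.

I measured memory use during execution to monitor the allocations.

The following class can process large files without any problem: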

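using System;
using System.Diagnostics;
using System.IO;
using System.Text.RegularExpressions;

// The class name is illustrative; the original post omitted the declaration.
public class FileProcessor
{
    public void FileProcess()
    {
        // Note: Process caches its counters; call proc.Refresh() before
        // reading PrivateMemorySize64 if you need an up-to-date value.
        Process proc = Process.GetCurrentProcess();
        int i = 0;
        while (i < 40)
        {
            Console.WriteLine(" i: {0} - Private memory: {1} KB- Memory Used: {2} KB", i,
                proc.PrivateMemorySize64 / 1024.0, GC.GetTotalMemory(true) / 1024.0);
            var path = "test0.txt";
            var path2 = "test2.txt";
            // Stream the input and output so only one line is held in memory at a time.
            using (FileStream fs = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.Read))
            using (BufferedStream bs = new BufferedStream(fs))
            using (StreamReader sr = new StreamReader(bs))
            using (StreamWriter outputFile = new StreamWriter(path2))
            {
                string line;
                while ((line = sr.ReadLine()) != null)
                {
                    var text = ProcessLine(line, i);
                    outputFile.WriteLine(text);
                }
            }
            i++;
            // No explicit GC.Collect() is needed here.
        } //while
    }

    // Applies the replacement to a single line; the leading \n of the
    // original pattern is dropped because ReadLine strips line endings.
    private string ProcessLine(string text, int i)
    {
        string Value3 = @"(.*?)&";
        string Value4 = string.Format("$1#{0}#", i);
        var strFile42 = Regex.Replace(text, Value3, Value4);
        return strFile42;
    }
}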

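Benchmark performance test:

I generated a 72 MB text file and processed it 40 times with this class, with no memory overhead at all. As you can see, the managed memory reported by GC.GetTotalMemory stays at roughly 252 to 275 KB, without any explicit GC.Collect() calls.

Results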

     i: 0 - Private memory: 18212 KB- Memory Used: 251.8828125 KB
     i: 1 - Private memory: 18212 KB- Memory Used: 274.5390625 KB
     i: 2 - Private memory: 18212 KB- Memory Used: 274.5390625 KB
     i: 3 - Private memory: 18212 KB- Memory Used: 274.5390625 KB
     i: 4 - Private memory: 18212 KB- Memory Used: 274.5390625 KB
     i: 5 - Private memory: 18212 KB- Memory Used: 274.5390625 KB
     i: 6 - Private memory: 18212 KB- Memory Used: 274.5390625 KB
     i: 7 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 8 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 9 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 10 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 11 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 12 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 13 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 14 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 15 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 16 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 17 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 18 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 19 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 20 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 21 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 22 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 23 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 24 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 25 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 26 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 27 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 28 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 29 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 30 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 31 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 32 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 33 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 34 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 35 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 36 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 37 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 38 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
     i: 39 - Private memory: 18212 KB- Memory Used: 274.48828125 KB
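
As a side note, the 40 passes in the question each replace the first remaining & on a line, so when processing line by line the same numbering could probably be produced in a single pass. The sketch below assumes that reading of the question's intent; the names SinglePass and NumberAmpersands are hypothetical, and unlike the question's \n(.*?)& pattern it also touches the first line of the file.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglePass
{
    // Built once and reused for every line.
    private static readonly Regex Ampersand = new Regex("&");

    // Replaces the first maxMarkers '&' characters in a line with #0#, #1#, ...
    public static string NumberAmpersands(string line, int maxMarkers)
    {
        int next = 0;
        // The count argument stops replacing after maxMarkers matches.
        return Ampersand.Replace(line, m => "#" + (next++) + "#", maxMarkers);
    }

    public static void Main()
    {
        Console.WriteLine(NumberAmpersands("a&b&c&d", 2)); // prints a#0#b#1#c&d
    }
}
```

Calling NumberAmpersands(line, 40) inside the ReadLine loop would then read and write the file once instead of 40 times.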

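Answer 1 (score: 3)

I wonder whether the Regex CacheSize is the problem.

MSDN has a page on Regex.CacheSize.

Quote:

The Regex class maintains an internal cache of compiled regular expressions used in static method calls. If the value specified in a set operation is less than the current cache size, cache entries are discarded until the cache size is equal to the specified value. By default, the cache holds 15 compiled static regular expressions. Your application typically will not have to modify the size of the cache. Use the CacheSize property only when you want to turn off caching or when you have an unusually large cache.

Here is a related question about Regex.CacheSize.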
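
Since the quoted cache applies to static method calls, one way to sidestep it is to construct a Regex instance once and reuse it on every pass. This is only a sketch of that idea, not a confirmed fix for the out-of-memory error; the names ReuseRegex and ReplacePass are hypothetical, while the pattern and replacement string are the ones from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReuseRegex
{
    // Built once; same pattern as in the question.
    private static readonly Regex LinePrefix = new Regex(@"\n(.*?)&");

    // Runs one replacement pass with the marker number i.
    public static string ReplacePass(string text, int i)
    {
        string replacement = string.Format("\n$1#{0}#", i);
        return LinePrefix.Replace(text, replacement);
    }

    public static void Main()
    {
        string text = "x\na&b\nc&d";
        for (int i = 0; i < 2; i++)
        {
            text = ReplacePass(text, i);
        }
        Console.WriteLine(text);
    }
}
```

This avoids touching Regex.CacheSize at all, but it does not reduce the size of the input and output strings, so the line-by-line approach is still the one that addresses the large allocations.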

Answer 2 (score: 1)

I am posting my answer here.

I use this code to clean up the memory leak:

    string strFile42 = File.ReadAllText(@"C:\Users\diego\Desktop\finalregex.txt");

    // regex code for replace (PUT REGEX CODE HERE)

    File.WriteAllText(@"C:\Users\diego\Desktop\finalregex.txt", strFile42);

    strFile42 = null;

    // Temporarily shrink the static Regex cache to zero so its entries are discarded.
    int oldCacheSize = Regex.CacheSize;
    Regex.CacheSize = 0;
    GC.Collect();

    // Restore the previous cache size.
    Regex.CacheSize = oldCacheSize;
    GC.Collect();
    GC.WaitForPendingFinalizers();