Question

I am trying to parse about 300k XML files spread across just under 400 folders.

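Each file is small, and for the first 80 to 130 folders the reported average is 0 milliseconds per file. However, after a variable number of files have been processed, the average rises to 6 milliseconds per file.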

The slowdown does not start at the same point from one run to the next.

I have tried adding Thread.Sleep(5000) and GC.Collect() to the program whenever the average time exceeds one millisecond, to no avail.

Watching CPU, memory, and disk usage, none of them seems to be getting out of hand.

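Here is a sample of the code I am running on my local machine. I don't think there is anything contentious in it, unless there is something odd going on in XmlDocument.Load.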

    // Requires: using System; using System.Diagnostics; using System.IO;
    //           using System.Threading; using System.Xml;

    public void Traversefolders()
    {
        string[] folders = Directory.GetDirectories(debug_root + @"\App_Data\AllPublicXML");
        Console.WriteLine($"Searching for data in {folders.Length} folders");
        int total = folders.Length;
        for (int i = 0; i < total; i++) {
            ReadFiles(i, total, folders[i]);
        }
    }

    public void ReadFiles(int index, int total, string folder)
    {
        Stopwatch s = Stopwatch.StartNew();

        string[] files = Directory.GetFiles(folder, "*.xml");

        foreach (string file in files) {
            XmlDocument doc = new XmlDocument();
            doc.Load(file);
            // READ XML DOC
            // At this point I am checking the value of a single field in the XML doc.
            // Eventually more will happen here, but I'm in the early dev phases
            // and am prototyping.
        }

        // Use floating-point division: ElapsedMilliseconds / files.Length is integer
        // division and truncates sub-millisecond averages to 0.
        double avg = files.Length > 0 ? s.Elapsed.TotalMilliseconds / files.Length : 0;
        Console.WriteLine($"Completed {Path.GetFileName(folder)} including {files.Length} files in {s.ElapsedMilliseconds} milliseconds (avg: {avg:F3}) {index + 1} of {total}");

        // Attempted workaround: pause and force a garbage collection when a folder is slow.
        if (avg > 1) {
            Thread.Sleep(5000);
            GC.Collect();
        }
    }

Does anyone have experience with slowdowns when parsing a large number of files in C#?

What could be going wrong here?

Parsing thousands of files leads to slowdown

0 answers: