How to get file groups from GetFiles()

Date: 2017-12-06 23:18:19

Tags: c#

I have to process files every day. The files are named like this:

 fg1a.mmddyyyy
 fg1b.mmddyyyy
 fg1c.mmddyyyy

 fg2a.mmddyyyy
 fg2b.mmddyyyy
 fg2c.mmddyyyy
 fg2d.mmddyyyy

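If an entire file group is present for a particular date, I can process it. If it is not all there, I should not process it. I may have several partial file groups that run over a few days. So I can only process a group (fg1) once I have fg1a.12062017, fg1b.12062017, and fg1c.12062017.

Here is my code so far. It doesn't work, because I can't figure out how to add only the complete groups to the list of files to process.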

        fileList = Directory.GetFiles(@"c:\temp\");

        string[] fileGroup1 = { "FG1A", "FG1B", "FG1C" }; // THIS IS A FULL GROUP
        string[] fileGroup2 = { "FG2A", "FG2B", "FG2C", "FG2D" };

        List<string> fileDates = new List<string>();
        List<string> procFileList;

        // get a list of file dates
        foreach (string fn in fileList)
        {
            string dateString = fn.Substring(fn.IndexOf('.'), 9);
            if (!fileDates.Contains(dateString))
            {
                fileDates.Add(dateString);
            }
        }

        bool allFiles = true;
        foreach (string fg in fileGroup1)
        {
            foreach (string fd in fileDates)
            {
                string finder = fg + fd;
                bool foundIt = false;
                foreach (string fn in fileList)
                {
                    if (fn.ToUpper().Contains(finder))
                    {
                        foundIt = true;
                    }

                }
                if (!foundIt)
                {
                    allFiles = false;
                }

                else
                {    
                    foreach (string fn in fileList)
                    {
                        procFileList.Add(fn);
                    }

                }

            }

        }

        foreach (string fg in fileGroup2)
        {
            foreach (string fd in fileDates)
            {
                string finder = fg + fd;
                bool foundIt = false;
                foreach (string fn in fileList)
                {
                    if (fn.ToUpper().Contains(finder))
                    {
                        foundIt = true;
                    }

                }
                if (!foundIt)
                {
                    allFiles = false;
                }
                else
                {    
                    foreach (string fn in fileList)
                    {
                        procFileList.Add(fn);
                    }

                }
            }

        }

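Any help or suggestions are much appreciated.

5 Answers:

Answer 0: (score: 2)

Because this involves working with multiple lists, grouping, and parsing file names, I would start by creating a class that represents a FileGroupItem. The class has a Parse method that takes a file path, along with properties for the group part and the date part of the file name and for the full path of the file: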

public class FileGroupItem
{
    public string DatePart { get; set; }
    public string GroupName { get; set; }
    public string FilePath { get; set; }

    public static FileGroupItem Parse(string filePath)
    {
        if (string.IsNullOrWhiteSpace(filePath)) return null;

        // Split the file name on the '.' character to get the group and date parts
        var fileParts = Path.GetFileName(filePath).Split('.');
        if (fileParts.Length != 2) return null;

        return new FileGroupItem
        {
            GroupName = fileParts[0],
            DatePart = fileParts[1],
            FilePath = filePath
        };
    }            
}

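Then, in my main code, I would create a list of file group definitions and populate a list of FileGroupItems from the directory we are scanning. After first grouping the FileGroupItems by DatePart, we can determine whether any file group definition is complete by comparing its items (case-insensitively) with the FileGroupItems actually found in the directory. If the intersection of the two lists has the same number of items as the file group definition, the group is complete and we can process it.

Maybe it will make more sense in code: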

private static void Main()
{
    var scanDirectory = @"f:\public\temp\";
    var processedDirectory = @"f:\public\temp2\";

    // The lists that define a complete group
    var fileGroupDefinitions = new List<List<string>>
    {
        new List<string> {"FG1A", "FG1B", "FG1C"},
        new List<string> {"FG2A", "FG2B", "FG2C", "FG2D"}
    };

    // Populate a list of FileGroupItems from the files 
    // in our directory, and group them on the DatePart
    var fileGroups = Directory.EnumerateFiles(scanDirectory)
        .Select(FileGroupItem.Parse)
        .Where(f => f != null) // Parse returns null for names that don't fit the pattern
        .GroupBy(f => f.DatePart);

    // Now go through each group and compare the items 
    // for that date with our file group definitions
    foreach (var fileGroup in fileGroups)
    {
        foreach (var fileGroupDefinition in fileGroupDefinitions)
        {
            // Get the intersection of the group definition and this file group
            var intersection = fileGroup
                .Where(f => fileGroupDefinition.Contains(
                    f.GroupName, StringComparer.OrdinalIgnoreCase))
                .ToList();

            // If all the items in the definition are there, then process the files
            if (intersection.Count == fileGroupDefinition.Count)
            {
                foreach (var fileGroupItem in intersection)
                {
                    Console.WriteLine($"Processing file: {fileGroupItem.FilePath}");

                    // Move the file to the processed directory
                    File.Move(fileGroupItem.FilePath,
                        Path.Combine(processedDirectory,
                            Path.GetFileName(fileGroupItem.FilePath)));
                }
            }
        }
    }

    Console.WriteLine("\nDone!\nPress any key to exit...");
    Console.ReadKey();
}
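
The intersection check described above can also be tried without touching the disk. The following is a minimal sketch, not part of the original answer: `GroupCompleteness` and `FindCompleteGroupFiles` are hypothetical names, and it assumes every file name has the form `<name>.<MMddyyyy>`. It counts distinct names rather than raw matches, so a duplicated entry cannot make an incomplete group look complete.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only.
public static class GroupCompleteness
{
    // Returns the files that belong to a complete group for some date.
    public static List<string> FindCompleteGroupFiles(
        IEnumerable<string> fileNames,
        IEnumerable<IEnumerable<string>> groupDefinitions)
    {
        var result = new List<string>();
        var definitions = groupDefinitions.Select(d => d.ToList()).ToList();

        // Split each name into "name" and "date"; skip anything without exactly two parts
        var parsed = fileNames
            .Select(n => new { Full = n, Parts = Path.GetFileName(n).Split('.') })
            .Where(x => x.Parts.Length == 2)
            .Select(x => new { x.Full, Name = x.Parts[0], Date = x.Parts[1] })
            .ToList();

        foreach (var byDate in parsed.GroupBy(x => x.Date))
        {
            foreach (var definition in definitions)
            {
                var wanted = new HashSet<string>(definition, StringComparer.OrdinalIgnoreCase);
                var matches = byDate.Where(x => wanted.Contains(x.Name)).ToList();

                // Complete only if every distinct name in the definition was seen
                var distinctSeen = matches
                    .Select(x => x.Name)
                    .Distinct(StringComparer.OrdinalIgnoreCase)
                    .Count();

                if (distinctSeen == wanted.Count)
                    result.AddRange(matches.Select(x => x.Full));
            }
        }

        return result;
    }
}
```

Passing the two definitions from the answer together with the names fg1a, fg1b, and fg1c for one date plus a lone fg2a should return only the three fg1 files.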

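Answer 1: (score: 1)

I think you can simplify your algorithm so that all you need is the file group as a prefix plus an expected number of files; for example, fg1 is 3 files for a given date.

I think your code for finding the distinct dates is a good idea, though you should use a HashSet rather than a List if you ever expect a large number of dates. ("Valentine's day?" - Ed)

Then you just need one more loop that does the checking. An algorithm like this: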

//make a new Dictionary<string,int> for the filegroup prefixes and their counts
//eg myDict["fg1"] = 3; myDict["fg2"] = 4;

//list the files in the directory, into an array of fileinfo objects
//see the DirectoryInfo.GetFiles method

//foreach string d in the list of dates
 //foreach string fgKey in myDict.Keys - the list of group prefixes

 //use a bit of Linq to get all the fileinfos with a 
 //name starting with the group and ending with the date
 var grplist = myfileinfos.Where(fi => fi.Name.StartsWith(fgKey) && fi.Name.EndsWith(d));

 //if the grplist.Count == the filegroup count ( myDict[fgKey] )
 //then send every file in grplist for processing
 //remember that grplist is a collection of fileinfo objects,
 //if your processing method takes a string filename, use fileinfo.Fullname

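Putting the file groups into one dictionary will be easier than keeping them as x separate arrays.

I haven't written all the code for you, but I have commented a sketch of the algorithm and put in some of the more awkward bits, such as the LINQ, the dictionary declaration, and how to fill it. Flesh it out with code, and ask any questions in the comments on this post.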
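
One possible way to flesh out that sketch is shown below. It is only an illustration under stated assumptions: `PrefixCountCheck` and `FindCompleteGroupFiles` are made-up names, the input is a list of bare file names (what `FileInfo.Name` would give) rather than `FileInfo` objects, and the date is taken to be everything from the last '.' onward.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class PrefixCountCheck
{
    // expectedCounts maps a group prefix to the number of files that make it complete,
    // e.g. { "fg1", 3 } and { "fg2", 4 }.
    public static List<string> FindCompleteGroupFiles(
        IEnumerable<string> fileNames,
        IDictionary<string, int> expectedCounts)
    {
        var names = fileNames.ToList();

        // Distinct date suffixes (including the leading '.'); the HashSet drops duplicates
        var dates = new HashSet<string>(
            names.Where(n => n.LastIndexOf('.') >= 0)
                 .Select(n => n.Substring(n.LastIndexOf('.'))));

        var result = new List<string>();
        foreach (var date in dates)
        {
            foreach (var entry in expectedCounts)
            {
                // All files for this prefix and this date
                var group = names
                    .Where(n => n.StartsWith(entry.Key, StringComparison.OrdinalIgnoreCase)
                             && n.EndsWith(date, StringComparison.Ordinal))
                    .ToList();

                // The group is complete when the count matches the expected number
                if (group.Count == entry.Value)
                    result.AddRange(group);
            }
        }

        return result;
    }
}
```

Because this variant only counts files by prefix, an unrelated name that happens to start with the same prefix (say, a hypothetical fg10a) would be counted toward fg1, so it fits best when the prefixes cannot overlap.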

Answer 2: (score: 1)

First, create an array of groups to make processing easier:

var fileGroups = new[] {
        new[] { "FG1A", "FG1B", "FG1C" },
        new[] { "FG2A", "FG2B", "FG2C", "FG2D" }
    };

Then you can convert the array into a Dictionary that maps each name back to its group:

var fileGroupMap = fileGroups.SelectMany(g => g.Select(f => new { key = f, group = g })).ToDictionary(g => g.key, g => g.group);

Then, preprocess the files retrieved from the directory:

var fileList = from fname in Directory.GetFiles(...)
               select new {
                   fname,
                   fdate = Path.GetExtension(fname),
                   ffilename = Path.GetFileNameWithoutExtension(fname).ToUpper()
               };

Now you can take fileList, group it by date and group, and then filter down to just the complete groups:

var procFileList = (from file in fileList
                    group file by new { file.fdate, fgroup = fileGroupMap[file.ffilename] } into fng
                    where fng.Key.fgroup.All(f => fng.Select(fn => fn.ffilename).Contains(f))
                    from fn in fng
                    select fn.fname).ToList();

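Since you were not keeping the groups, I flattened the groups at the end of the query into a single list of files to process. If you need to, you can keep them grouped and process the groups instead.

Note: if there are files that do not belong to any group, the lookup in fileGroupMap will throw an error. If that is possible, you can filter fileList down to known names only, like this: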
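
As a rough sketch of the "keep them grouped" variant, the query can select one list per complete (date, group) pair instead of flattening. The names `GroupedResult` and `FindCompleteGroups` are hypothetical, and the method takes the file names as a parameter so it can be tried without a directory; unknown names are filtered out up front so the dictionary lookup cannot fail.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only.
public static class GroupedResult
{
    // Returns one list of file names per complete (date, group) pair.
    public static List<List<string>> FindCompleteGroups(
        IEnumerable<string> fileNames, string[][] fileGroups)
    {
        // Map each name back to its group (assumes a name appears in only one group)
        var fileGroupMap = fileGroups
            .SelectMany(g => g.Select(f => new { key = f, group = g }))
            .ToDictionary(x => x.key, x => x.group);

        var files = from fname in fileNames
                    let ffilename = Path.GetFileNameWithoutExtension(fname).ToUpperInvariant()
                    where fileGroupMap.ContainsKey(ffilename)
                    select new { fname, fdate = Path.GetExtension(fname), ffilename };

        // Group by date and group array; the same array instance is shared by all
        // names of a group, so reference equality on the key is sufficient here
        return (from file in files
                group file by new { file.fdate, fgroup = fileGroupMap[file.ffilename] } into fng
                where fng.Key.fgroup.All(f => fng.Any(fn => fn.ffilename == f))
                select fng.Select(fn => fn.fname).ToList()).ToList();
    }
}
```

Each inner list can then be handed to the processing step as a unit, which is convenient if a group must be processed all-or-nothing.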

var fileList = from fname in Directory.GetFiles(...)
               let ffilename = Path.GetFileNameWithoutExtension(fname).ToUpper()
               where fileGroupMap.ContainsKey(ffilename)
               select new {
                   fname,
                   fdate = Path.GetExtension(fname),
                   ffilename
               };

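Also note that using a name in more than one group will cause an error when fileGroupMap is created. If that is possible, the query becomes more complex and has to be written differently.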
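
If names can be shared between groups, one way around the dictionary is to loop over the groups and look names up per date instead. This is a sketch under that assumption, not the answer's own method; `OverlappingGroups` and `FindCompleteGroupFiles` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only.
public static class OverlappingGroups
{
    // Works even when the same name is listed in more than one group.
    public static List<string> FindCompleteGroupFiles(
        IEnumerable<string> fileNames, string[][] fileGroups)
    {
        // A set, so a file shared by two complete groups is only listed once
        var result = new HashSet<string>();

        foreach (var dateGroup in fileNames.GroupBy(f => Path.GetExtension(f)))
        {
            // Case-insensitive lookup from name (without the date) to the files for this date
            var byName = dateGroup.ToLookup(
                f => Path.GetFileNameWithoutExtension(f),
                StringComparer.OrdinalIgnoreCase);

            foreach (var group in fileGroups)
            {
                // Skip the group unless every one of its names is present for this date
                if (!group.All(name => byName.Contains(name)))
                    continue;

                foreach (var name in group)
                    foreach (var file in byName[name])
                        result.Add(file);
            }
        }

        return result.ToList();
    }
}
```

The trade-off is that each group is checked against every date, which is fine for a handful of groups but does more work than the single grouped query.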

Answer 3: (score: 0)

Here is a simple class:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] filenames = { "fg1a.12012017", "fg1b.12012017", "fg1c.12012017", "fg2a.12012017", "fg2b.12012017", "fg2c.12012017", "fg2d.12012017" };
            new SplitFileName(filenames);
            List<List<SplitFileName>> results = SplitFileName.GetGroups(); 
        }
    }
    public class SplitFileName
    {
        public static List<SplitFileName> names = new List<SplitFileName>(); 
        string filename { get; set; }
        string prefix { get; set; }
        string letter { get; set; }
        DateTime date { get; set; }


        public SplitFileName() { }
        public SplitFileName(string[] splitNames)
        {
            foreach(string name in splitNames)
            {
                SplitFileName splitName = new SplitFileName();
                names.Add(splitName);
                splitName.filename = name;
                string[] splitArray = name.Split(new char[] { '.' });
                splitName.date = DateTime.ParseExact(splitArray[1],"MMddyyyy", System.Globalization.CultureInfo.InvariantCulture);
                splitName.prefix = splitArray[0].Substring(0, splitArray[0].Length - 1);
                splitName.letter = splitArray[0].Substring(splitArray[0].Length - 1,1);
            }
        }
        public static List<List<SplitFileName>> GetGroups()
        {
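            // Note: the letters are hard-coded, so only groups with exactly a,b,c,d pass;
            // a three-file group such as fg1 (a,b,c) would need its own expected letters.
            return names.OrderBy(x => x.letter).GroupBy(x => new { date = x.date, prefix = x.prefix })
                .Where(x => string.Join(",",x.Select(y => y.letter)) == "a,b,c,d")
                .Select(x => x.ToList())
                .ToList();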
        }

    }
}

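Answer 4: (score: 0)

With everyone's help, I solved this as well. This is what I'm going with, because it is the most maintainable for me, but the other solutions are very clever! Thank you all for your help.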

   private void CheckFiles()
    {
        var fileGroups = new[] {
          new [] { "FG1A", "FG1B", "FG1C", "FG1D" },
          new[] { "FG2A", "FG2B", "FG2C", "FG2D", "FG2E" } };


        List<string> fileDates = new List<string>();
        List<string> pfiles = new List<string>();

        // get a list of file dates
        foreach (string fn in fileList)
        {
            string dateString = fn.Substring(fn.IndexOf('.'), 9);
            if (!fileDates.Contains(dateString))
            {
                fileDates.Add(dateString);
            }
        }

        // check if a date has all the files
        foreach (string fd in fileDates)
        {
            // for each file group
            foreach (Array masterfg in fileGroups)
            {
                // reset the expected count for every group, otherwise it keeps
                // growing and later groups can never match
                int fgCount = 0;
                foreach (string fg in masterfg)
                {
                    // see if all the files are there
                    string finder = fg + fd;
                    foreach (string fn in fileList)
                    {
                        if (fn.ToUpper().Contains(finder))
                        {
                            pfiles.Add(fn);
                        }

                    }
                    fgCount++;

                }
                if (fgCount == pfiles.Count())
                {
                    foreach (string fn in pfiles)
                    {
                        procFileList.Add(fn);
                    }
                    pfiles.Clear();
                }
                else
                {
                    pfiles.Clear();
                }
            }
        }

        return;
    }