Loop through a set of files and check whether each one is comma delimited in C#

Time: 2012-08-14 15:03:38

Tags: c# ssis

I need to loop through a set of files and check whether each one is comma delimited, using C#. I am new to C#. Please help.

Thanks in advance.

2 answers:

Answer 0: (score: 0)

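As someone already pointed out, the solution is not that easy. If every comma-delimited file has a known extension (for example, csv), it can be quite easy. If not, the following algorithm should work: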

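  1. Retrieve the names (path + name) of all files in the specified directory. If needed, filter them down to the ones that might be of interest. Hint: take a look at System.IO.Directory and System.IO.File, as well as System.IO.DirectoryInfo and System.IO.FileInfo.
  2. Open each file and check whether it is comma delimited. This is the tricky part. You can build a regular expression that checks every line of the file and tells you whether it is comma delimited.
  3. Regular expressions are a bit hard to learn at first, but the effort should pay off after a while.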
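
The steps above can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the class name CommaCheckSketch, the method names, and the regular expression itself are assumptions made for this example. The pattern accepts a line only if it has at least two comma-separated fields, where a field is either double-quoted (with "" as an escaped quote) or free of commas and quotes. Quoted fields that span several lines are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class CommaCheckSketch
{
    // Hypothetical pattern: field ( "," field )+ where a field is either a
    // double-quoted string or a run of characters without commas or quotes.
    private static readonly Regex CommaLine = new Regex(
        "^(?:\"(?:[^\"]|\"\")*\"|[^\",]*)(?:,(?:\"(?:[^\"]|\"\")*\"|[^\",]*))+$",
        RegexOptions.Compiled);

    // Step 2: test a single line against the pattern.
    public static bool IsCommaDelimitedLine(string line)
    {
        return CommaLine.IsMatch(line);
    }

    // Step 1 + 2: list the files, then keep those in which every
    // non-blank line looks comma delimited. Files with no non-blank
    // lines are skipped.
    public static IEnumerable<string> FindCommaDelimitedFiles(
        string directory, string searchPattern = "*")
    {
        foreach (string file in Directory.GetFiles(directory, searchPattern))
        {
            List<string> lines = File.ReadLines(file)
                .Where(l => l.Trim().Length > 0)
                .ToList();

            if (lines.Count > 0 && lines.All(IsCommaDelimitedLine))
            {
                yield return file;
            }
        }
    }
}
```

If the files of interest do share an extension, passing "*.csv" as searchPattern narrows the scan to them, for example CommaCheckSketch.FindCommaDelimitedFiles(@"C:\Temp", "*.csv").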

Answer 1: (score: 0)

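Here is a quick console application that takes a directory, scans all the files in it, then loops through them and reports the percentage of lines that contain commas versus the total number of lines in each file. As has already been pointed out, there are CSV libraries you can use to validate the files. This is just a simple example to get you started.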

To use it, create a new Console App project in Visual Studio, name it "TestStub", and then copy and paste the following into the "Program.cs" file.

namespace TestStub
{
    using System;
    using System.IO;
    using System.Text;

    public class Program
    {
        // Delimiters that mark a line as comma separated
        private static readonly char[] CSV = { ',' };
        /// <summary>
        /// This is the console program entry point
        /// </summary>
        /// <param name="args">A list of any command-line args passed to this application when started</param>
        public static void Main(string[] args)
        {
            // Change this to use args[0] if you like 
            string myInitialPath = @"C:\Temp";
            string[] myListOfFiles;

            try
            {
                myListOfFiles = EnumerateFiles(myInitialPath);

                foreach (string file in myListOfFiles)
                {
                    Console.WriteLine("\nFile {0} is comprised of {1}% CSV delimited lines.",
                        file, 
                        ScanForCSV(file));
                }

                Console.WriteLine("\n\nPress any key to exit.");
                Console.ReadKey();
            }
            catch (Exception ex)
            {
                Console.WriteLine(
                    "Error processing {0} for CSV content: {1} :: {2}", 
                    myInitialPath, 
                    ex.Message, 
                    ex.InnerException != null ? ex.InnerException.Message : "(no inner exception)");
            }
        }

        /// <summary>
        /// Get a list of all files for the specified path
        /// </summary>
        /// <param name="path">Directory path</param>
        /// <returns>String array of files (with full path)</returns>
        public static string[] EnumerateFiles(string path)
        {
            string[] arrItems = new string[1];

            try
            {
                arrItems = Directory.GetFiles(path);
                return arrItems;
            }
            catch (Exception ex)
            {
                throw new System.IO.IOException("EnumerateFiles() encountered an error:", ex);
            }
        }
        /// <summary>
        /// Determines if the supplied file has comma separated values
        /// </summary>
        /// <param name="filename">Path and filename</param>
        /// <returns>Percentage of lines containing CSV elements -vs- those without</returns>
        public static float ScanForCSV(string filename)
        {
            //
            // NOTE: You should look into one of the many CSV libraries
            // available. This method will not carefully scrutinize
            // the file to see if there's a combination of delimiters or
            // even if it's a plain-text (e.g. a newspaper article)
            // It just looks for the presence of commas on multiple lines
            // and calculates a percentage of them with and without
            //
            float totalLines = 0;
            float linesCSV = 0;

            try
            {
                using (StreamReader sReader = new StreamReader(filename))
                {
                    int elements = 0;
                    string line = string.Empty;
                    string[] parsed = new string[1];

                    while (!sReader.EndOfStream)
                    {
                        ++totalLines;
                        line = sReader.ReadLine();
                        parsed = line.Split(CSV);
                        elements = parsed.Length;
                        if (elements > 1)
                        {
                            ++linesCSV;
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                throw new System.IO.IOException(string.Format("Problem accessing [{0}]: {1}", filename, ex.Message), ex);
            }
            // An empty file has no lines; report 0% instead of dividing by zero (NaN)
            return totalLines == 0 ? 0f : (linesCSV / totalLines) * 100;
        }
    }  
}
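
The answer suggests looking into a CSV library for stricter validation. As one possible sketch of that idea, the example below uses TextFieldParser from the Microsoft.VisualBasic.FileIO namespace, which ships with the .NET Framework and can be used from C#. It assumes the project can reference the Microsoft.VisualBasic assembly. The class name CsvShapeSketch and the rule "every row has the same number of fields, and at least two" are assumptions made for this example, not something the answer specifies.

```csharp
using System;
using Microsoft.VisualBasic.FileIO;

public static class CsvShapeSketch
{
    // Returns true when the file parses as comma-delimited text and every
    // non-blank row has the same number of fields (at least two).
    public static bool LooksLikeCsv(string filename)
    {
        try
        {
            using (TextFieldParser parser = new TextFieldParser(filename))
            {
                parser.TextFieldType = FieldType.Delimited;
                parser.SetDelimiters(",");
                parser.HasFieldsEnclosedInQuotes = true;

                int expected = -1; // field count of the first row, once known
                while (!parser.EndOfData)
                {
                    string[] fields = parser.ReadFields();
                    if (fields == null)
                    {
                        break; // no more rows
                    }

                    if (expected < 0)
                    {
                        expected = fields.Length;
                    }
                    else if (fields.Length != expected)
                    {
                        return false; // ragged rows: probably not CSV
                    }
                }

                return expected > 1;
            }
        }
        catch (MalformedLineException)
        {
            // The parser could not split a row, e.g. an unbalanced quote
            return false;
        }
    }
}
```

In the program above, Main could call CsvShapeSketch.LooksLikeCsv(file) next to ScanForCSV(file) to print a yes/no verdict alongside the percentage.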
