Parsing a tab-delimited file: detecting whether the first value in a row is empty (a tab)

Asked: 2013-03-04 23:21:25

Tags: c#-3.0

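Hi all, I need to parse some files and load them into a DataSet. I have run into a problem: the first value in a row is sometimes blank, so when I parse the data the values added to the columns are off by one, because the row has no value for [RouteCode].

Sample data: the column names are on the first line (tab-delimited) and the data rows are on the lines that follow (tab-delimited).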
RouteCode City EmailAddress FirstName
NULL MyCity My-Email MyFirstName

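What I see is that all the columns are added correctly, but for every row that is added, the leading tab (the empty first value) is not detected, so the values shift across columns (I hope that makes sense). In this case the City data ends up in the RouteCode column, and the last column somehow gets the first row value (the tab).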

using System;
using System.Data;
using System.IO;

class TextToDataSet
{
    public TextToDataSet()
    { }

    /// <summary>
    /// Converts a given delimited file into a DataSet.
    /// Assumes that the first line of the text file
    /// contains the column names.
    /// </summary>
    /// <param name="File">The name of the file to open</param>
    /// <param name="TableName">The name of the
    /// table to be made within the DataSet returned</param>
    /// <param name="delimiter">The delimiter characters; each line
    /// is split on any one of them</param>
    /// <returns>A DataSet containing one table with the imported rows</returns>
    public static DataSet Convert(string File,
     string TableName, string delimiter)
    {
        //The DataSet to Return
        DataSet result = new DataSet();

        //Open the file in a stream reader.
        using (StreamReader s = new StreamReader(File))
        {
            //Split the first line into the columns       
            string[] columns = s.ReadLine().Split(delimiter.ToCharArray());
            //Add the new DataTable to the DataSet
            result.Tables.Add(TableName);
            //Cycle the columns, adding those that don't exist yet
            //and sequencing the ones that do.
            foreach (string col in columns)
            {
                bool added = false;
                string next = "";
                int i = 0;
                while (!added)
                {
                    //Build the column name and remove any unwanted characters.
                    string columnname = col + next;
                    columnname = columnname.Replace("#", "");
                    columnname = columnname.Replace("'", "");
                    columnname = columnname.Replace("&", "");
                    //See if the column already exists
                    if (!result.Tables[TableName].Columns.Contains(columnname))
                    {
                        //if it doesn't then we add it here and mark it as added
                        result.Tables[TableName].Columns.Add(columnname);
                        added = true;
                    }
                    else
                    {
                        //if it did exist then we increment the sequencer and try again.
                        i++;
                        next = "_" + i;
                    }
                }
            }
            //Read the rest of the data in the file.
            string AllData = s.ReadToEnd();
            //Split off each row at the line ending: CRLF for most Windows
            //exports (Excel, Access, etc.), bare LF otherwise.
            //Splitting on "\n" alone would leave a trailing "\r" on the last field.
            //Completely empty lines are discarded so they do not become rows.
            string[] rows = AllData.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
            //Now add each row to the DataSet        
            foreach (string r in rows)
            {

                //Split the row at the delimiter.
                string[] items = r.Split(delimiter.ToCharArray());
                //Add the item
                result.Tables[TableName].Rows.Add(items);
            }
        }

        //Return the imported data.        
        return result;
    }
}

1 Answer:

Answer 0 (score: 0)

If there should never be any missing entries anywhere in the file (i.e. there should always be something between the tabs), then you can use:

string[] columns = s.ReadLine().Split(delimiter.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

Then check whether columns is an empty array. If it is, read the next line and carry on processing:

while (columns.Length == 0)
{
    // Row is empty, so read the next line from the file
    string line = s.ReadLine();
    if (line == null) break; // End of file: no populated row was found
    columns = line.Split(delimiter.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
}

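This will ensure that your data always starts with a populated row. However, it will break if a row contains an empty entry: RemoveEmptyEntries drops the empty field, so the remaining values shift left into the wrong columns.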
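
To make the effect of an empty entry concrete, here is a minimal sketch of how String.Split treats a row that starts with a tab. The class name SplitDemo and the helper Fields are made up for this illustration, and the sample row assumes the blank RouteCode shows up in the file as a leading tab.

```csharp
using System;

static class SplitDemo
{
    // Hypothetical helper: splits a row on tabs, optionally discarding empty fields.
    public static string[] Fields(string row, bool removeEmpty)
    {
        StringSplitOptions options = removeEmpty
            ? StringSplitOptions.RemoveEmptyEntries
            : StringSplitOptions.None;
        return row.Split(new char[] { '\t' }, options);
    }

    static void Main()
    {
        // Assumed sample row: RouteCode is empty, so the line begins with a tab.
        string row = "\tMyCity\tMy-Email\tMyFirstName";

        // Keeping empty entries preserves the field positions of the header.
        string[] kept = Fields(row, false);
        Console.WriteLine(kept.Length + " fields, second is " + kept[1]);
        // prints "4 fields, second is MyCity"

        // Removing empty entries moves every value one column to the left.
        string[] dropped = Fields(row, true);
        Console.WriteLine(dropped.Length + " fields, first is " + dropped[0]);
        // prints "3 fields, first is MyCity"
    }
}
```

So RemoveEmptyEntries is probably only safe for lines where every field is guaranteed to have a value, such as the header line.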

If there may be empty entries, then you will probably have to check whether all of the columns are empty instead:

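// Requires: using System.Linq;
while (columns.All(c => string.IsNullOrEmpty(c)))
{
    // Row is empty, so read the next line from the file
    string line = s.ReadLine();
    if (line == null) break; // End of file: no populated row was found
    columns = line.Split(delimiter.ToCharArray());
}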
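
The same check could also be applied to the data rows, not just the header. Below is a rough sketch of one way to do that; RowParser and ParseRow are hypothetical names, not part of the question's code or of .NET. It assumes a single-character delimiter, treats a line made up only of delimiters and whitespace as blank, and pads short rows (or cuts long ones) to the number of columns in the header. Cutting long rows silently discards data, so you may prefer to throw there instead.

```csharp
using System;

static class RowParser
{
    // Hypothetical helper: returns null for a blank line, otherwise exactly
    // columnCount fields, with empty fields kept in their original positions.
    public static string[] ParseRow(string line, char delimiter, int columnCount)
    {
        if (line == null) return null;

        // Drop a trailing carriage return left over from Windows line endings.
        line = line.TrimEnd('\r');

        // No RemoveEmptyEntries here, so an empty first field stays at index 0.
        string[] items = line.Split(delimiter);

        // A line that holds only delimiters and whitespace carries no data.
        bool allEmpty = true;
        foreach (string item in items)
        {
            if (item.Trim().Length > 0) { allEmpty = false; break; }
        }
        if (allEmpty) return null;

        // Pad or cut so the number of fields matches the header.
        string[] result = new string[columnCount];
        for (int i = 0; i < columnCount; i++)
        {
            result[i] = i < items.Length ? items[i] : "";
        }
        return result;
    }
}
```

Inside the question's row loop this might be used as follows, skipping blank lines rather than adding them:

```csharp
string[] items = RowParser.ParseRow(r, '\t', result.Tables[TableName].Columns.Count);
if (items != null) result.Tables[TableName].Rows.Add(items);
```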