Extra zeros added when writing to a file in C#

Time: 2016-07-05 18:39:07

Tags: c# regex datetime writetofile tryparse

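I have a CSV file containing a number of records. Those records contain dates in various formats. I want to convert all of them to MM/dd/yyyy, with a leading 0 on any single-digit month or day. The problem is that when the result is written to the file, a bunch of extra 0s get added, and I can't figure out why. A sample of my data: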

Title,Labels,Type,Current State,Created at,Accepted at,Deadline,Requested By,Description,Owned By,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment,Comment
pad,pad,epic,,9/26/2012 0:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
655656 add security role xxxx,user updates,chore,accepted,7/20/2012 0:00,7/23/2012 0:00,,xxxx,"Call Number: 655656 
Client Name: xxxxx
Department: 
Address: xxxx
Phone: (xxx)xxx-xxxx
Open Date/Time: 6/25/2012 2:50:52 PM
Opened by: MAGIC 

Problem Description: Effective Date: 07/09/2012 12:00 a       
Area: CASE COMPASS.
Action: ADD ACCESS
Report/other Role: NONE
App Role: FIELD()

xxxx 7/18/2012 9:17 AM: created user id and assigned roles in enterprise security 

Notes:  

Problem Resolution: 7/19/12 - xxxx: Access granted, AD account added to the HL_Viewer security group.

CDS\xxxx -- S-1-5-21-508124448-3695470602-466989033-155771 

Magic URL:  http://magicweb02/magictsd 
",Jane Doe, Please verify (Jane Doe - 07/23/2012 0:00),verified (Jamie Doe -07/23/2012 00:00),,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
655977 add security role xxxx,user updates,chore,accepted,7/19/2012 0:00,7/23/2012 0:00,,xxx,"Call Number: 655977 

My code looks like this:

try
{
    string file = File.ReadAllText("C:\\\\Users\\hacknj\\Desktop\\mo_daily_activity_20160627_1412.csv"); 

    // Define bad date                
    Regex badDate = new Regex(@"(\d{1,2}\/\d{1,2}\/\d{4})");

    // Find Matches
    MatchCollection matches = badDate.Matches(file);

    // Go through each match
    foreach (Match match in matches)
    {
        // get the match text
        string matchText = match.Groups[0].ToString();                    

        // Define DateTime
        DateTime parsedDate;

        DateTime.TryParse(matchText.Trim(), out parsedDate);

        file = file.Replace(matchText, parsedDate.ToString("MM/dd/yyyy"));                    
    }
    File.WriteAllText("C:\\\\Users\\hacknj\\Desktop\\TestFile.csv", file);
} 

Here is what some of the dates look like after being written to the file:

pad,pad,epic,,000009/26/2012 0:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
655656 add security role xxxx,user updates,chore,accepted,0000007/20/2012 0:00,00000007/23/2012 0:00,,xxxx,"Call Number: 655656 

If I inspect the data before the replacement, it looks fine. I check it with:

MessageBox.Show("Match Text: " + matchText.Trim() + "\nParsed Date: " + parsedDate.ToString("MM/dd/yyyy"));

Can anyone tell me what I'm doing that produces these extra 0s when writing to the file?

2 Answers:

Answer 0: (score: 4)

The extra zeros are the result of this line running inside the loop:

file = file.Replace(matchText, parsedDate.ToString("MM/dd/yyyy"));

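If the same date appears more than once in the file, this line replaces every occurrence each time the loop reaches one of the matches. string.Replace has no idea which occurrence the current match refers to, and the padded result still contains the original text ("09/26/2012" contains "9/26/2012"). So if the date needs a leading zero, every copy of that date gains another leading zero each time the line runs for it.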
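The accumulation described above can be reproduced without touching the file system. The following is a minimal sketch, assuming US-style month/day/year dates; the class and method names (`ReplaceInLoopDemo`, `PadDatesInLoop`) are made up for illustration, and the invariant culture is passed explicitly so the result does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ReplaceInLoopDemo
{
    // Hypothetical helper: runs the question's loop on an in-memory string.
    public static string PadDatesInLoop(string text)
    {
        Regex badDate = new Regex(@"\d{1,2}/\d{1,2}/\d{4}");

        // The matches are computed against the ORIGINAL string.
        foreach (Match match in badDate.Matches(text))
        {
            DateTime parsedDate;
            if (!DateTime.TryParse(match.Value, CultureInfo.InvariantCulture,
                                   DateTimeStyles.None, out parsedDate))
            {
                continue;
            }

            // Replaces EVERY occurrence of the match text, including the
            // tail of dates that were already padded on an earlier pass.
            text = text.Replace(
                match.Value,
                parsedDate.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture));
        }
        return text;
    }

    public static void Main()
    {
        // The same date three times: each pass prepends one more zero.
        string sample = "9/26/2012 0:00,9/26/2012 0:00,9/26/2012 0:00";
        Console.WriteLine(PadDatesInLoop(sample));
        // prints 0009/26/2012 0:00,0009/26/2012 0:00,0009/26/2012 0:00
    }
}
```

Three occurrences of the date produce three extra zeros on each copy, which matches the pattern seen in the written file.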

Instead, you can use Regex.Replace() with a MatchEvaluator function to reformat each matched date in place:

var newFile = Regex.Replace(file, @"(\d{1,2}\/\d{1,2}\/\d{4})", m =>
{
    string matchText = m.Groups[0].ToString();
    DateTime parsedDate;
    if (DateTime.TryParse(matchText.Trim(), out parsedDate))
    {
        return parsedDate.ToString("MM/dd/yyyy");
    }
    else
    {
        return matchText;
    }
});

File.WriteAllText("C:\\\\Users\\hacknj\\Desktop\\TestFile.csv", newFile);
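One detail that may be worth pinning down in this approach is culture: DateTime.TryParse and the "/" in a custom format string both follow the current culture by default. Below is a hedged, self-contained variant that works on a string and fixes the culture explicitly. It assumes the dates are month/day/year; `DateNormalizer` and `NormalizeDates` are illustrative names, not part of the original answer.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateNormalizer
{
    private static readonly Regex DatePattern = new Regex(@"\d{1,2}/\d{1,2}/\d{4}");

    // Hypothetical helper: reformats each matched date exactly once.
    public static string NormalizeDates(string text)
    {
        return DatePattern.Replace(text, m =>
        {
            DateTime parsed;
            // Assumption: input dates are month/day/year.
            bool ok = DateTime.TryParseExact(
                m.Value, "M/d/yyyy", CultureInfo.InvariantCulture,
                DateTimeStyles.None, out parsed);

            // Leave anything that is not a real date untouched.
            return ok
                ? parsed.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture)
                : m.Value;
        });
    }

    public static void Main()
    {
        string line = "epic,,9/26/2012 0:00,7/20/2012 0:00,07/23/2012 0:00";
        Console.WriteLine(NormalizeDates(line));
        // prints epic,,09/26/2012 0:00,07/20/2012 0:00,07/23/2012 0:00
    }
}
```

Because the evaluator only sees the text of the current match, repeated dates elsewhere in the input are not affected by it.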

Answer 1: (score: 3)

Change

  • from: Regex badDate = new Regex(@"(\d{1,2}\/\d{1,2}\/\d{4})");
  • to: Regex badDate = new Regex(@"\d{1,2}\/\d{1,2}\/\d{4}"); (remove the parentheses).

Change

  • from: string matchText = match.Groups[0].ToString();
  • to: string matchText = match.Value;

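Also, if you want to capture the day, month, and year separately, the following will do the job in a pinch. There is no need to do the replacement in a loop (strings are immutable anyway, so it's a bad idea). You don't have to worry about int.Parse throwing, because the function body only runs when the text matches the pattern you defined (1 or 2 digits, 1 or 2 digits, 2 or 4 digits). Note that new DateTime(...) can still throw ArgumentOutOfRangeException if a match is not a real calendar date, such as 13/45/2012.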

Regex badDate = new Regex(@"(?<Month>\d{1,2})\/(?<Day>\d{1,2})\/(?<Year>(20)?\d{2})");

File.WriteAllText(
    path, 
    badDate.Replace(
        file, 
        m => { 
            var year  = int.Parse(m.Groups["Year"].Value);
            var month = int.Parse(m.Groups["Month"].Value);
            var day   = int.Parse(m.Groups["Day"].Value);
            if (year < 2000) year += 2000;
            var datetime = new DateTime(year, month, day);
            return datetime.ToString("MM/dd/yyyy");
        }
    )
);

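The (?<NamedGroup>RegexPattern) syntax makes debugging a little easier, and the consuming code is easier to read. It's still regex, but it's better than nothing. I changed your year pattern to optionally accept 20 followed by 2 digits, which should cover 2- or 4-digit years between 2000 and 2099. Adjust as needed. My apologies to your descendants for the looming Y2100 bug.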
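For reference, here is a self-contained sketch of the named-group approach that works on a string rather than a file and adds a range check before constructing the DateTime. The range check is an addition for illustration, not part of the answer, and `NamedGroupDates` and `Normalize` are made-up names. As in the answer, years outside 2000 to 2099 are not handled correctly.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class NamedGroupDates
{
    // Same named-group pattern as in the answer above.
    private static readonly Regex BadDate = new Regex(
        @"(?<Month>\d{1,2})\/(?<Day>\d{1,2})\/(?<Year>(20)?\d{2})");

    // Hypothetical helper: pads month/day and expands 2-digit years.
    public static string Normalize(string text)
    {
        return BadDate.Replace(text, m =>
        {
            int year  = int.Parse(m.Groups["Year"].Value, CultureInfo.InvariantCulture);
            int month = int.Parse(m.Groups["Month"].Value, CultureInfo.InvariantCulture);
            int day   = int.Parse(m.Groups["Day"].Value, CultureInfo.InvariantCulture);
            if (year < 2000) year += 2000;

            // Added guard (assumption): leave impossible dates untouched
            // instead of letting the DateTime constructor throw.
            if (month < 1 || month > 12 ||
                day < 1 || day > DateTime.DaysInMonth(year, month))
            {
                return m.Value;
            }

            return new DateTime(year, month, day)
                .ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);
        });
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("7/19/12 - xxxx, 9/26/2012 0:00, 13/45/2012"));
        // prints 07/19/2012 - xxxx, 09/26/2012 0:00, 13/45/2012
    }
}
```

The month is validated before DateTime.DaysInMonth is called, since that method itself throws for a month outside 1 to 12.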