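Read a text file and get the date values

Time: 2019-03-20 23:53:07

Tags: c# readfile

Is there a simple way to find the lines that contain a date and time?

So far I can read the text files. The next step is to parse them, but before that I think I need some guidance. Here is my current reading script: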

List<string> Temp = new List<string>();            
string[] filePaths = Directory.GetFiles(@"C:\Temp\", "*.txt");

foreach (string files in filePaths)
{
    var fileStream = new FileStream(files, FileMode.Open, FileAccess.Read);
    using (var streamReader = new StreamReader(fileStream, Encoding.UTF8))
    {
        Temp.Add(streamReader.ReadToEnd());
    }
}

foreach (string i in Temp)
{
    if (i.Contains("Events"))
    {
        Console.WriteLine(i);        
    }
}

Here is a sample of the text template generated by the tool, which is what I need to parse.

"[Output]"
"[Events]"
"Time"  "Duration"  "Severity"  "Event" "Text1" "Text2"


"[Acquisition Settings_1]"
"Data Set"  "DataSet1"
"Data Stream"   "Data"


"[Scan Data (Pressures in Torr)]"
"Time"  "Scan"  "Mass 1"    "Mass 2"    "Mass 3"    
"10/25/2018 4:59:27 PM" 1   5.5816e-008 1.3141e-008 -1.6109e-010    
"10/25/2018 4:59:35 PM" 2   5.5484e-008 1.3403e-008 6.9720e-010 
"10/25/2018 4:59:41 PM" 3   5.5633e-008 1.3388e-008 8.8094e-011 
"10/25/2018 4:59:48 PM" 4   5.7289e-008 1.2343e-008 1.4095e-010 
"10/25/2018 4:59:54 PM" 5   5.2841e-008 1.3219e-008 7.5257e-010 

"10/25/2018 4:59:57 PM" "After Calibration due to marginal data of daily pm3 rga checking"  
"10/25/2018 5:49:51 PM" "RGA Base Pressure
Flat pallet (2018-10-25_011_a1a)"   
"10/25/2018 6:21:53 PM" "PM3 SiNFILL_27A
2018-10-25_011_A4A" 
"10/25/2018 9:51:29 PM" "IBE1 STEP
FULL TAPE
NO PRE-BAKE"    
"10/25/2018 9:58:48 PM" "IBE2 STEP

My goal, or the expected result, is to get only the rows that have a datetime value:

"10/25/2018 4:59:27 PM" 1   5.5816e-008 1.3141e-008 -1.6109e-010    
"10/25/2018 4:59:35 PM" 2   5.5484e-008 1.3403e-008 6.9720e-010 
"10/25/2018 4:59:41 PM" 3   5.5633e-008 1.3388e-008 8.8094e-011 
"10/25/2018 4:59:48 PM" 4   5.7289e-008 1.2343e-008 1.4095e-010 
"10/25/2018 4:59:54 PM" 5   5.2841e-008 1.3219e-008 7.5257e-010 

Any suggestions? TIA.

1 answer:

Answer 0 (score: 1)

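You could probably (tentatively) get away with something like this. It takes the negative exponent notation into account, and it also allows for the tabs in the original format (not shown in the example).

Pattern

^"\d+/\d+/\d+ \d+:\d+:\d+ (AM|PM)"\s+-?\d+\s+\d+\.?\d+e-\d+

(Inside a C# verbatim string, each " in the pattern is written as "".)

*Note: I'm not going to write a regex explanation because it would be too long.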
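For readers who want to check which lines the pattern accepts, here is a minimal sketch that runs it against three lines taken from the question. The sample lines are joined with tabs on the assumption that the tool's real output is tab-separated, and the names PatternDemo and ScanLine are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // Same pattern as above, written as a verbatim string (quotes doubled).
    public static readonly Regex ScanLine = new Regex(
        @"^""\d+/\d+/\d+ \d+:\d+:\d+ (AM|PM)""\s+-?\d+\s+\d+\.?\d+e-\d+");

    public static void Main()
    {
        // Assumption: columns are tab-separated, as in the tool's original output.
        string[] samples =
        {
            "\"10/25/2018 4:59:27 PM\"\t1\t5.5816e-008\t1.3141e-008\t-1.6109e-010",
            "\"Time\"\t\"Scan\"\t\"Mass 1\"\t\"Mass 2\"\t\"Mass 3\"",
            "\"10/25/2018 4:59:57 PM\"\t\"After Calibration due to marginal data of daily pm3 rga checking\"",
        };

        foreach (var line in samples)
        {
            // Print whether each line is recognised as a scan row.
            Console.WriteLine($"{ScanLine.IsMatch(line)}: {line}");
        }
    }
}
```

Only the first sample should match: the header row does not start with a date, and the comment row has quoted text where the scan number is expected.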

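Example

using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

var pattern = @"^""\d+/\d+/\d+ \d+:\d+:\d+ (AM|PM)""\s+-?\d+\s+\d+\.?\d+e-\d+";
var regex = new Regex(pattern, RegexOptions.Compiled);

var filePaths = Directory.GetFiles(@"C:\Temp", "*.txt");

var results = new List<string>();

foreach (var file in filePaths)
{
   // Read each file lazily and keep only the lines that match the pattern
   var lines = File.ReadLines(file).Where(x => regex.IsMatch(x));
   results.AddRange(lines);
}

However, to take it one step further, you can do something like the following. This puts all the parsed data into a class.

Given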

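using System;
using System.Globalization;
using System.Linq;

public class ScanData
{
   public DateTime Time { get; set; }
   public int Scan { get; set; }
   public decimal?[] MassResults { get; set; }

   public static ScanData FromString(string data)
   {
      // Drop trailing whitespace so a trailing tab does not produce an empty column
      var split = data.TrimEnd().Split('\t');

      // Parse one mass column; values use exponent notation such as 5.5816e-008
      decimal? Local(string value)
      {
         return decimal.TryParse(value, NumberStyles.Float, CultureInfo.InvariantCulture, out var output) ? output : (decimal?)null;
      }

      var scanData = new ScanData()
                     {
                        // Invariant culture so that "PM" and the decimal point parse regardless of the machine locale
                        Time = DateTime.ParseExact(split[0].Trim('"'), "M/d/yyyy h:m:s tt", CultureInfo.InvariantCulture),
                        Scan = int.Parse(split[1], CultureInfo.InvariantCulture),
                        MassResults = split.Skip(2).Select(Local).ToArray()
                     };

      return scanData;
   }

}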
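
One possible way to combine the two pieces is sketched below: filter the matching lines with the regex, then hand each one to a FromString-style parser. To stay runnable on its own, the sketch carries a condensed copy of the ScanData class (parsing with the invariant culture, which is an assumption about the input) and works on in-memory sample lines instead of files. The tab separators and the names UsageDemo, ParseAll and ParseMass are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Condensed copy of the ScanData class so the sketch compiles on its own.
public class ScanData
{
    public DateTime Time { get; set; }
    public int Scan { get; set; }
    public decimal?[] MassResults { get; set; }

    public static ScanData FromString(string data)
    {
        // Trailing whitespace is dropped so a trailing tab adds no empty column.
        var split = data.TrimEnd().Split('\t');
        return new ScanData
        {
            Time = DateTime.ParseExact(split[0].Trim('"'), "M/d/yyyy h:m:s tt", CultureInfo.InvariantCulture),
            Scan = int.Parse(split[1], CultureInfo.InvariantCulture),
            MassResults = split.Skip(2).Select(ParseMass).ToArray()
        };
    }

    // Returns null when a column is not a number in float/exponent notation.
    private static decimal? ParseMass(string value) =>
        decimal.TryParse(value, NumberStyles.Float, CultureInfo.InvariantCulture, out var d) ? d : (decimal?)null;
}

public static class UsageDemo
{
    private static readonly Regex ScanLine = new Regex(
        @"^""\d+/\d+/\d+ \d+:\d+:\d+ (AM|PM)""\s+-?\d+\s+\d+\.?\d+e-\d+");

    // Hypothetical helper: keep only scan rows and parse each into a ScanData.
    public static List<ScanData> ParseAll(IEnumerable<string> lines) =>
        lines.Where(l => ScanLine.IsMatch(l)).Select(ScanData.FromString).ToList();

    public static void Main()
    {
        // Assumption: tab-separated columns, with a trailing tab as in the tool output.
        var lines = new[]
        {
            "\"Time\"\t\"Scan\"\t\"Mass 1\"\t\"Mass 2\"\t\"Mass 3\"",
            "\"10/25/2018 4:59:27 PM\"\t1\t5.5816e-008\t1.3141e-008\t-1.6109e-010\t",
            "\"10/25/2018 4:59:35 PM\"\t2\t5.5484e-008\t1.3403e-008\t6.9720e-010\t",
        };

        foreach (var row in ParseAll(lines))
        {
            Console.WriteLine($"{row.Time:s} scan={row.Scan} masses={row.MassResults.Length}");
        }
    }
}
```

With real files, the in-memory array could be replaced by File.ReadLines(file) for each path returned by Directory.GetFiles, so that filtering and parsing happen in one pass per file.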