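How do I extract a specific paragraph from a .txt file using C#?

Time: 2017-09-20 04:28:02

Tags: c#

I have a .txt file that contains information about every airport in the world. I need to read some parameters of each airport using C#.

In the .txt file, the format looks like this: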

Airport ZSPD Latitude... Longitude...
      ...(multiple lines)
      (one blank line here)
      ...(multiple lines)
      (another blank line here)
Airport ZSSS Latitude... Longitude...
      ... 

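Now I need to extract the paragraph for a single airport so I can read it. Any ideas? Thanks!

A complete example looks like this (the original was over 600 lines, so I replaced runs of similar lines with "..."):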

Airport KPAE :N47:54:22.8182  W122:16:53.6295  606ft
      Country Name="United States"
      State Name="Washington"
      City Name="Everett"
      Airport Name="Snohomish Co"
      in file: Scenery\0101\scenery\APX15140.bgl

      Runway 16R/34L centre: N47:54:32.3097  W122:17:08.1038  606ft
          Start 16R: N47:55:15.2970  W122:17:09.0664  606ft Hdg: 179.1T, Length 9004ft 
          Computed start 16R: Lat 47.921322 Long -122.285860
          Start 34L: N47:53:49.3225  W122:17:07.1430  606ft Hdg: 359.1T, Length 9004ft 
          Computed start 34L: Lat 47.896626 Long -122.285307
          Hdg: 179.140 true (MagVar 19.700), Asphalt, 9004 x 150 ft
          Primary ILS ID = IPAE
          Primary ILS: IPAE  109.30 Hdg: 179.1 , Flags: GS BC "ILS 16R"
          *** Runway *** KPAE0162 Lat 47.921322 Long -122.285858 Alt 606 Hdg 159 Len 9004 Wid 150 ILS 109.30, Flags: GS BC
          *** Runway *** KPAE0341 Lat 47.896626 Long -122.285309 Alt 606 Hdg 339 Len 9004 Wid 150
      Runway 11 /29  centre: N47:54:21.4900  W122:16:47.8649  606ft
          Start 11 : N47:54:36.4886  W122:17:10.8542  606ft Hdg: 134.2T, Length 4508ft 
          Computed start 11 : Lat 47.910278 Long -122.286575
          Start 29 : N47:54:06.4915  W122:16:24.8759  606ft Hdg: 314.2T, Length 4508ft 
          Computed start 29 : Lat 47.901657 Long -122.273346
          Hdg: 134.180 true (MagVar 19.700), Asphalt, 4508 x 75 ft
          *** Runway *** KPAE0110 Lat 47.910278 Long -122.286575 Alt 606 Hdg 114 Len 4508 Wid 75
          *** Runway *** KPAE0290 Lat 47.901657 Long -122.273346 Alt 606 Hdg 294 Len 4508 Wid 75
      Runway 16L/34R centre: N47:54:08.3055  W122:16:17.9493  606ft
          Start 16L: N47:54:22.6238  W122:16:18.1134  606ft Hdg: 179.6T, Length 2997ft 
          Computed start 16L: Lat 47.906414 Long -122.271699
          Start 34R: N47:53:53.9872  W122:16:17.7866  606ft Hdg: 359.6T, Length 2997ft 
          Computed start 34R: Lat 47.898197 Long -122.271605
          Hdg: 179.560 true (MagVar 19.700), Asphalt, 2997 x 75 ft
          *** Runway *** KPAE0161 Lat 47.906414 Long -122.271698 Alt 606 Hdg 160 Len 2997 Wid 75
          *** Runway *** KPAE0342 Lat 47.898197 Long -122.271606 Alt 606 Hdg 340 Len 2997 Wid 75
      COM: Type=10 (CENTRE), Freq=128.50, Name="SEATTLE"
      COM: Type=13 (ASOS), Freq=128.65, Name=""
      ...
      Taxipoint #0, type 5 (?):  N47:54:07.1069  W122:16:23.6608  -- Forward
      Taxipoint #1, type 5 (?):  N47:54:07.1069  W122:16:28.0026  -- Reverse
      ...
      Parking Park1 [#G0]:  N47:54:10.2492  W122:16:42.0037
          Type 2 (GA Ramp Small), Size 10.0m, Hdg 225.1T
      Parking Park2 [#G1]:  N47:54:09.2126  W122:16:40.5185
          Type 3 (GA Ramp Medium), Size 14.0m, Hdg 225.1T
      ...
      Gate P11 [#G7]:  N47:55:09.3365  W122:16:49.9781
          Type 10 (Heavy Gate), Size 36.0m, Hdg 90.0T
      Gate P10 [#G8]:  N47:55:03.7646  W122:16:50.0650
          Type 9 (Medium Gate), Size 23.0m, Hdg 90.0T
      ...
      Parking Park8 [#G14]:  N47:53:50.9422  W122:16:43.1112
          Type 3 (GA Ramp Medium), Size 14.0m, Hdg 179.1T
      Parking Park9 [#G15]:  N47:53:50.8774  W122:16:47.3646
          Type 3 (GA Ramp Medium), Size 14.0m, Hdg 179.1T
      ...
      Gate G12 [#G19]:  N47:54:00.1422  W122:16:45.1389
          Type 8 (Small Gate), Size 18.0m, Hdg 90.7T
      Gate G13 [#G20]:  N47:54:00.1422  W122:16:55.6108
          Type 9 (Medium Gate), Size 23.0m, Hdg 270.0T
      ...
      Parking Park13 [#G34]:  N47:54:15.4971  W122:16:47.2985
          Type 12 (?), Size 16.0m, Hdg 314.7T
      Parking Park14 [#G35]:  N47:54:13.6830  W122:16:42.5816
          Type 3 (GA Ramp Medium), Size 14.0m, Hdg 43.6T
      ...
      Taxipath (Name #0):  Type 3 (Parking), Start#=24, End#=G0, Wid=21.34m
      Taxipath (Name #0):  Type 3 (Parking), Start#=25, End#=G1, Wid=21.34m
      ...
      Taxiname:  #0 = 
      Taxiname:  #1 = A1
      ...
    TaxiWay : G0-24-25-G1
    TaxiWay : G2-26-81-80-79-1-78
    ...
      FSM A/P KPAE, lat=47.906342, long=-122.281563, alt=606

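What I need to read is the airport ID; the altitude, latitude, longitude and Hdg of every runway, gate and parking spot (some airports have no gates or parking); and the length and ILS of every runway.

I need to read them into a DataTable, so that each runway, gate and parking spot occupies one row of the DataTable.

Which approach should I use to extract the data I want and to tell one airport apart from the others in this .txt file? I need some ideas. Thank you very much!

2 answers:

Answer 0: (score: 0)

The best approach is to convert the text file into a better-structured file type, such as .xml: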

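// Requires: using System.IO; using System.Linq; using System.Xml.Linq;
// Wrap every line of the text file in a <Line> element and save the result.
string[] data = File.ReadAllLines("yourTextFileName.txt");
XElement root = new XElement("root",
                            from item in data
                            select new XElement("Line", item));
root.Save("SaveFileName.xml");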

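Then read the XML elements back and filter them:

// The saved file contains only <Line> elements, so filter on their text.
var doc = XElement.Load("SaveFileName.xml");

// "KPAE" is just the airport ID from the question's example.
var result = doc.Elements("Line")
                .Where(x => x.Value.StartsWith("Airport KPAE "))
                .ToList();

foreach (var line in result)
    Console.WriteLine(line.Value);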
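
Because every line of the file is stored on its own, getting one airport's whole paragraph still means collecting every line from its Airport header up to the next header. Here is a minimal sketch of that step, working directly on the lines of the file. AirportBlocks and ExtractBlock are hypothetical names, and the sketch assumes that only header lines begin with "Airport " in the first column, as in the question's sample.

```csharp
using System;
using System.Collections.Generic;

public static class AirportBlocks
{
    // Returns the header line and all following lines of one airport,
    // stopping at the next line that starts with "Airport " in column 0.
    // Assumption: indented lines such as "      Airport Name=..." never
    // start in column 0, so they are not mistaken for a header.
    public static List<string> ExtractBlock(IEnumerable<string> lines, string icao)
    {
        var block = new List<string>();
        bool inside = false;

        foreach (string line in lines)
        {
            if (line.StartsWith("Airport ", StringComparison.Ordinal))
            {
                if (inside)
                    break; // reached the next airport: the block is complete

                inside = line.StartsWith("Airport " + icao + " ", StringComparison.Ordinal);
            }

            if (inside)
                block.Add(line);
        }

        return block;
    }
}
```

It could be called as AirportBlocks.ExtractBlock(File.ReadLines("yourTextFileName.txt"), "KPAE"). File.ReadLines streams the file, so reading stops as soon as the block ends. Blank lines inside the block are kept, so the two sub-paragraphs stay recognisable.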

Answer 1: (score: 0)

My approach would be to use a regular expression like this:

var txt = @"Airport ZSPD Latitude... Longitude...
  ...(multiple lines)
  ...(multiple lines)

  ...(multiple lines)
  ...(multiple lines)
  ...(multiple lines)

Airport ZSPD Latitude... Longitude...
  ...(multiple lines)
  ...(multiple lines)

  ...(multiple lines)
  ...(multiple lines)
  ...(multiple lines)";

var result =
    Regex.Matches(
        txt,
        @"^airport\s+(?<ap>\S+)\s+(?<lat>\S+)\s+(?<lng>\S+)\s+(?<p1>[\s\S]+?)^\s*$(?<p2>[\s\S]+?)(^\s*$|(?!\s)$)",
        RegexOptions.Multiline | RegexOptions.IgnoreCase | RegexOptions.CultureInvariant)
        .OfType<Match>()
        .Select(
            c =>
            new
                {
                    Airport = c.Groups["ap"].Value,
                    Latitude = c.Groups["lat"].Value,
                    Longitude = c.Groups["lng"].Value,
                    Paragraph1 = c.Groups["p1"].Value,
                    Paragraph2 = c.Groups["p2"].Value
                })
        .ToList();

foreach (var item in result)
{
    Console.WriteLine(item);
}
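
Once an airport's paragraph is isolated, the per-runway values the question asks for can be pulled out with a smaller, line-level pattern and loaded into a DataTable. The following is only a sketch under stated assumptions: RunwayParser, ToTable and the column names are hypothetical, and the pattern assumes the "*** Runway ***" lines always list Lat, Long, Alt, Hdg, Len and Wid in that order, with an optional ILS frequency, as in the KPAE sample.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Text.RegularExpressions;

public static class RunwayParser
{
    // Matches the "*** Runway ***" summary lines from the question's sample.
    // Assumption: the field order is fixed and the ILS part is optional.
    private static readonly Regex RunwayLine = new Regex(
        @"^\s*\*\*\* Runway \*\*\* (?<id>\S+) Lat (?<lat>-?[\d.]+) Long (?<lng>-?[\d.]+)" +
        @" Alt (?<alt>-?[\d.]+) Hdg (?<hdg>[\d.]+) Len (?<len>\d+) Wid (?<wid>\d+)" +
        @"(?: ILS (?<ils>[\d.]+))?",
        RegexOptions.CultureInvariant);

    // Builds one DataTable row per runway line found in the airport's block.
    public static DataTable ToTable(string airport, IEnumerable<string> blockLines)
    {
        var table = new DataTable("Runways");
        table.Columns.Add("Airport", typeof(string));
        table.Columns.Add("RunwayId", typeof(string));
        table.Columns.Add("Latitude", typeof(double));
        table.Columns.Add("Longitude", typeof(double));
        table.Columns.Add("AltitudeFt", typeof(double));
        table.Columns.Add("Heading", typeof(double));
        table.Columns.Add("LengthFt", typeof(int));
        table.Columns.Add("IlsFreq", typeof(string)); // empty when no ILS is listed

        foreach (string line in blockLines)
        {
            Match m = RunwayLine.Match(line);
            if (!m.Success)
                continue; // not a runway summary line

            // Invariant culture: the file uses '.' as the decimal separator.
            table.Rows.Add(
                airport,
                m.Groups["id"].Value,
                double.Parse(m.Groups["lat"].Value, CultureInfo.InvariantCulture),
                double.Parse(m.Groups["lng"].Value, CultureInfo.InvariantCulture),
                double.Parse(m.Groups["alt"].Value, CultureInfo.InvariantCulture),
                double.Parse(m.Groups["hdg"].Value, CultureInfo.InvariantCulture),
                int.Parse(m.Groups["len"].Value, CultureInfo.InvariantCulture),
                m.Groups["ils"].Value);
        }

        return table;
    }
}
```

In the sample, the Hdg on these lines (159 for KPAE0162) differs from the true heading printed a few lines earlier (179.140 true, MagVar 19.700), so it is worth checking which of the two is needed. Gate and Parking entries span two lines each and give their coordinates in N47:54:10.2492 form, so they would need their own pattern and a conversion step. Those are not covered here.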
