C# regex to match an address - making a group optional

Time: 2016-09-27 10:46:45

Tags: c# regex

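I need to parse a German address that I receive as a single string, e.g. "Example Street 5b". I want to split it into three groups: street, number, and addition.

For example: address = "Test Str. 5b"

-> street: "Test Str.", number: "5", addition: "b"

My code looks like this: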

string street = "";
string number = "";
string addition = "";
//this works:
string address = "Test Str. 5b";
//this doesn't match, but I want it in the street group:
//string address = "Test Str.";        
Match adressMatch = Regex.Match(address, @"(?<street>.*?\.*)\s*(?<number>[1-9][0-9]*)\s*(?<addition>.*)");

street = adressMatch.Groups["street"].Value;
number = adressMatch.Groups["number"].Value;
addition = adressMatch.Groups["addition"].Value;

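This code works for the example and for most other cases.

My problem:

If the address does not contain a number, the match fails. I tried adding *? after the number group, along with several other things, but then the whole string ends up in "addition" while "street" and "number" stay empty. What I want when the number is missing is for the string to go into "street", with "number" and "addition" left empty.

Thanks in advance :)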

1 answer:

Answer 0: (score: 0)

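I would do it this way: match the street into the street group, then match the number, if there is one, into the number group, and let the rest go into the addition group.

Then, if the number group did not participate in the match, move the addition value into number, which is easy to do in C# code.

So, use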

(?<street>.*\.)(?:\s*(?<number>[1-9][0-9]*))?\s*(?<addition>.*) 
            ^^  ^^                         ^^

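See the regex demo (note the changes: the first .*? becomes greedy, the * quantifier after \. is removed, and the number group, together with the preceding \s*, is made optional).

Then, use this logic (C# sample snippet):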
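To see why the greedy street group matters, here is a minimal sketch that runs the same input through two patterns. The class name OptionalGroupDemo and the helper Split are made up for illustration. Lazy is the question's street group with the number group simply made optional, which is one way the attempt described in the question could have looked. Greedy is the pattern from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalGroupDemo
{
    // Question's lazy street group, with the number group simply made optional (assumed variant).
    public const string Lazy =
        @"(?<street>.*?\.*)(?:\s*(?<number>[1-9][0-9]*))?\s*(?<addition>.*)";

    // Pattern from this answer: greedy street group that must end in a period.
    public const string Greedy =
        @"(?<street>.*\.)(?:\s*(?<number>[1-9][0-9]*))?\s*(?<addition>.*)";

    // Hypothetical helper: returns "street|number|addition" for one pattern and input.
    public static string Split(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        if (!m.Success)
            return "<no match>";
        // A group that did not participate in the match has an empty Value.
        return m.Groups["street"].Value + "|"
             + m.Groups["number"].Value + "|"
             + m.Groups["addition"].Value;
    }
}
```

With "Test Str. 5b", Split(OptionalGroupDemo.Lazy, ...) returns "||Test Str. 5b": the lazy street group is satisfied with an empty match, the optional number group is skipped, and addition takes everything. The greedy pattern returns "Test Str.|5|b" for the same input.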

string street = "";
string number = "";
string addition = "";
//string address = "Test Str. 5b"; // => Test Str. |  5  |  b
string address = "Test Str. b"; // => Test Str. |  b  |  
Match adressMatch = Regex.Match(address, @"(?<street>.*\.)(?:\s*(?<number>[1-9][0-9]*))?\s*(?<addition>.*)");
if (adressMatch.Success) {
    street = adressMatch.Groups["street"].Value;
    addition = adressMatch.Groups["addition"].Value;
    if (adressMatch.Groups["number"].Success)
        number = adressMatch.Groups["number"].Value;
    else 
    {
        number = adressMatch.Groups["addition"].Value;
        addition = string.Empty;
    }
}
Console.WriteLine("Street: {0}\nNumber: {1}\nAddition: {2}", street, number, addition);
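
If this is needed in more than one place, the same logic could be wrapped in a small helper. The following is only a sketch: the names AddressParser and Parse are hypothetical, and returning a value tuple assumes C# 7 or later.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AddressParser
{
    // Same pattern as above, compiled once and reused.
    private static readonly Regex Pattern = new Regex(
        @"(?<street>.*\.)(?:\s*(?<number>[1-9][0-9]*))?\s*(?<addition>.*)");

    // Hypothetical helper: splits an address into street, number and addition.
    public static (string Street, string Number, string Addition) Parse(string address)
    {
        Match m = Pattern.Match(address);
        if (!m.Success)
            return ("", "", ""); // no period in the input, so the street group cannot match

        string street = m.Groups["street"].Value;
        string addition = m.Groups["addition"].Value;

        if (m.Groups["number"].Success)
            return (street, m.Groups["number"].Value, addition);

        // No number found: move whatever followed the street into Number.
        return (street, addition, "");
    }
}
```

One limitation to keep in mind: the street group must end in a literal period, so an input such as "Example Street 5b" does not match at all and all three values stay empty.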