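Python regex: multiple groups within one group

Date: 2015-07-22 12:09:14

Tags: python regex

I'm having trouble getting all matches of my Python regex pattern when several matching values sit on the same line.

The contents of my text file (names.txt) are: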

Country: US
City: New York
Jessica is 15 years old single, and Edward is 27 years old single.
City: Boston
Daniel is 63 years old married, and Oscar is 102 years old divorsed.
Country: Canada
City: Sydney
David is 22 years old single, and Rebecca is 33 years old single.
City: Liverpool
Joe is 45 years old divorsed, and Alexander is 29 years old married.

My Python script is:

import re

with open("names.txt") as f:
    text = f.read()

# Each match must start at a "Country:" line immediately followed by a "City:" line.
pattern = re.compile(r'^(Country:.+?)\n^(City:.+?)\n([A-Z][a-z]*).*?(\d{1,3}).*?(single|married|divorsed)', re.MULTILINE)

for m in pattern.finditer(text):
    print(m.group(1) + " > " + m.group(2) + " || " + m.group(3) + ": " + m.group(4) + ", " + m.group(5))

Result:

Country: US > City: New York || Jessica: 15, single
Country: Canada > City: Sydney || David: 22, single
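A reduced example may help show why only one row per country comes back. This is a sketch on an inline sample rather than names.txt, and the `person` pattern and variable names are illustrative assumptions, not part of the script above. The full pattern can only start a match at a "Country:" line, so the second city is never reached, and a capture group holds a single value per match, so only the first person is kept. A pattern that describes one person, applied to one line, finds everyone on that line.

```python
import re

# Sample taken from the question's names.txt (first country only).
text = (
    "Country: US\n"
    "City: New York\n"
    "Jessica is 15 years old single, and Edward is 27 years old single.\n"
    "City: Boston\n"
    "Daniel is 63 years old married, and Oscar is 102 years old divorsed.\n"
)

# The question's pattern: every match has to begin at a "Country:" line.
full = re.compile(
    r'^(Country:.+?)\n^(City:.+?)\n([A-Z][a-z]*).*?(\d{1,3}).*?(single|married|divorsed)',
    re.MULTILINE,
)
full_matches = [m.groups() for m in full.finditer(text)]

# Illustrative person-only pattern: applied to one line, it finds each person on it.
person = re.compile(r'([A-Z][a-z]*) is (\d{1,3}) years old (single|married|divorsed)')
line = "Jessica is 15 years old single, and Edward is 27 years old single."
people = person.findall(line)

print(full_matches)  # one tuple: only the first city and the first person
print(people)        # two tuples: both people on the line
```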

I want to capture all names, grouped by city and country, like this:

Country: US > City: New York || Jessica: 15, single - Edward: 27, single
Country: US > City: Boston || Daniel: 63, married - Oscar: 102, divorsed
Country: Canada > City: Sydney || David: 22, single - Rebecca: 33, single
Country: Canada > City: Liverpool || Joe: 45, divorsed - Alexander: 29, married

1 answer:

Answer 0 (score: 0)

I found a solution by taking the text of each match as a string and searching it again with nested patterns:

import re

with open("names.txt") as f:
    text = f.read()

# One match per country: the "Country:" line followed by exactly two city/people pairs.
pattern = re.compile(r'(^(Country:.+?)\n((City:.+?)\n.*\n^(City:.+?)\n.*))', re.MULTILINE)
# A "City:" line plus the line of people that follows it.
cities = re.compile(r'(City:.*)\n(([A-Z][a-z]*).*?(\d{1,3}).*?(single|married|divorsed).*)')
# A single person: name, age, marital status.
names = re.compile(r'([A-Z][a-z]*).*?(\d{1,3}).*?(single|married|divorsed)')

for m in pattern.finditer(text):
    print(m.group(2))  # country line
    # group(3) holds both city blocks of this country
    for m2 in cities.finditer(m.group(3)):
        print(m2.group(1))  # city line
        # group(2) is the people line; search it for every person
        for m3 in names.finditer(m2.group(2)):
            print(m3.group(1) + ": " + m3.group(2) + " > " + m3.group(3))
    print("-------------------------")

Output:

Country: US
City: New York
Jessica: 15 > single
Edward: 27 > single
City: Boston
Daniel: 63 > married
Oscar: 102 > divorsed
-------------------------
Country: Canada
City: Sydney
David: 22 > single
Rebecca: 33 > single
City: Liverpool
Joe: 45 > divorsed
Alexander: 29 > married
-------------------------
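
The question asked for one line per city rather than the nested listing above. As a possible alternative, here is a minimal sketch that walks the file line by line, remembers the current country and city, and uses `findall` to collect every person on a line. It assumes each line of people follows its "City:" line; the helper name `group_by_city` is hypothetical, and the input is inlined here in place of reading names.txt.

```python
import re

# Sample input copied from the question's names.txt.
text = """\
Country: US
City: New York
Jessica is 15 years old single, and Edward is 27 years old single.
City: Boston
Daniel is 63 years old married, and Oscar is 102 years old divorsed.
Country: Canada
City: Sydney
David is 22 years old single, and Rebecca is 33 years old single.
City: Liverpool
Joe is 45 years old divorsed, and Alexander is 29 years old married.
"""

# Same person pattern as in the answer ("divorsed" is spelled as in the data).
PERSON = re.compile(r'([A-Z][a-z]*).*?(\d{1,3}).*?(single|married|divorsed)')


def group_by_city(text):
    """Return one formatted line per city (hypothetical helper)."""
    rows = []
    country = city = None
    for line in text.splitlines():
        if line.startswith("Country:"):
            country = line.strip()
        elif line.startswith("City:"):
            city = line.strip()
        else:
            # findall returns a (name, age, status) tuple for each person on the line.
            people = PERSON.findall(line)
            if people and country and city:
                joined = " - ".join(f"{n}: {a}, {s}" for n, a, s in people)
                rows.append(f"{country} > {city} || {joined}")
    return rows


rows = group_by_city(text)
for row in rows:
    print(row)
```

Unlike the nested-pattern version, this sketch does not depend on each country having exactly two cities, since the country and city are tracked as state rather than encoded in a single pattern.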