How do I form separate blocks using regular expressions in Python?

Time: 2017-06-14 09:57:16

Tags: python regex resume

This is my code:

   <LinearLayout
    android:layout_width="match_parent"
    android:layout_height="70dp"
    android:background="#9294a3">

        <TextView
        android:id="@+id/tv_2"
        android:layout_width="wrap_content"
        android:text="2"
        android:textColor="#c2bbbb"
        android:textSize="30sp"
        android:background="@drawable/border"
        android:layout_height="wrap_content" />

    </LinearLayout>

Input:

BASIC INFORMATION

Name: John

Phone No.: +91-9876543210

DOB: 21-10-1995

SKILL SET

Java

Python

Output: ('BASIC INFORMATION', 'Name: John') ('SKILL SET', 'Java')

But the required output is: ('BASIC INFORMATION', 'Name: John', 'Phone No.: +91-9876543210', 'DOB: 21-10-1995') ('SKILL SET', 'Java', 'Python')

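2 Answers:

Answer 0: (score: 0)

Replace re.MULTILINE with re.DOTALL so that your .* matches across multiple lines (yes, the flag names are somewhat misleading). You will also need to split the resulting string on \n.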
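As a minimal illustration of the flag difference: re.MULTILINE only changes where ^ and $ match, while re.DOTALL lets . match newline characters. The sample text and the pattern below are assumptions made for this sketch, not the asker's original regex.

```python
import re

# Hypothetical sample text; not the asker's actual file.
text = "BASIC INFORMATION\nName: John\nDOB: 21-10-1995"

# Illustrative pattern: a heading line followed by "everything after it".
pattern = r"^([A-Z ]+)\n(.*)"

# re.MULTILINE only affects ^ and $, so .* still stops at the first newline.
multiline_match = re.search(pattern, text, re.MULTILINE)

# re.DOTALL lets . match "\n", so .* runs to the end of the text.
dotall_match = re.search(pattern, text, re.DOTALL)

print(multiline_match.group(2))           # Name: John
print(dotall_match.group(2).split("\n"))  # ['Name: John', 'DOB: 21-10-1995']
```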

In general, a regexp is not the best tool for this task; this should work better:

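import string

# Characters allowed in a section heading such as "BASIC INFORMATION"
HEADING_CHARS = set(string.ascii_uppercase + ' ')

results = []
for line in inputfile.splitlines():
  line = line.strip()
  if not line:
    continue  # skip blank lines (all() is True for an empty string)
  if all(c in HEADING_CHARS for c in line):
    results.append([line])  # a heading starts a new block
  elif results:
    results[-1].append(line)  # any other line joins the current block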

Answer 1: (score: 0)

It is tough to get all of the output with a regex alone because your file text is not simple.

But with a regex plus a little extra effort you can achieve this easily:

import re

# This regex fetches all titles (i.e. BASIC INFORMATION, SKILL SET, ...)
results = re.findall(r"([A-Z ]{4,})", inputfile)

After that, a little more work gets you the desired result:

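items = []
for z in results:
    # Everything before the next title belongs to the previous block
    item = inputfile[:inputfile.index(z)]
    inputfile = inputfile[len(item):]  # drop the text already consumed
    if item:
        items.append([line for line in item.split('\n') if line])
# Whatever remains is the last block
items.append([line for line in inputfile.split('\n') if line])
print(items)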


OUTPUT:
[ ['BASIC INFORMATION', 'Name: John', 'Phone No.: +91-9876543210', 'DOB: 21-10-1995'],
  ['SKILL SET', 'Java', 'Python']
]
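
One possible variation on the approach above is to anchor the title pattern with ^ and $ under re.MULTILINE and pass it to re.split, which yields the headings and their bodies in one pass. This is a sketch that assumes every heading occupies a whole line of at least four uppercase letters or spaces. The names text, heading, and blocks are illustrative, and the sample text is reconstructed from the question.

```python
import re

# Sample text reconstructed from the question; assumed to be a single string.
text = """BASIC INFORMATION
Name: John
Phone No.: +91-9876543210
DOB: 21-10-1995
SKILL SET
Java
Python"""

# Assumption: a heading is a whole line of 4+ uppercase letters and spaces.
heading = re.compile(r"^([A-Z ]{4,})$", re.MULTILINE)

# With a capturing group, re.split keeps the headings in the result:
# ['', heading1, body1, heading2, body2, ...]
parts = heading.split(text)

blocks = []
for title, body in zip(parts[1::2], parts[2::2]):
    # Drop blank lines and surrounding whitespace from each body line.
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    blocks.append([title.strip()] + lines)

print(blocks)
```

Anchoring the pattern keeps uppercase runs inside ordinary lines (for example a name written in capitals) from being mistaken for headings.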