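I have sentences of the following form (a keyword, followed by an opening parenthesis, followed by arbitrary text, followed by two dates separated by a hyphen):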
Mohandas Karamchand Gandhi (/ˈɡɑːndi, ˈɡæn-/; Hindustani: [ˈmoːɦənd̪aːs ˈkərəmtʃənd̪ ˈɡaːnd̪ʱi]; 2 October 1869 – 30 January 1948) was the preeminent leader of the Indian independence movement in British-ruled India.
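I need to extract the date of birth (2 October 1869) and the date of death (30 January 1948) from this sentence using a regular expression. I have already written a regular expression that matches the date pattern: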
date_pattern="(\d{1,2}(\s|-|/)?(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May?|June?|July?|Aug(ust)?|Sep(t(ember)?)?|Oct(ober)?|Nov(ember)?|Dec(ember)?|\d{1,2})(\s|-|/)?\d{2,4})"
I need to pick out sentences of the above form and print the birth date and the death date separately.
Answer 0 (score: 1)
import re
text = '''Mohandas Karamchand Gandhi (/ˈɡɑːndi, ˈɡæn-/; Hindustani: [ˈmoːɦənd̪aːs ˈkərəmtʃənd̪ ˈɡaːnd̪ʱi]; 2 October 1869 – 30 January 1948) was the preeminent leader of the Indian independence movement in British-ruled India.'''
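# Each match starts at a digit and runs over spaces and word characters;
# strip() drops the space captured before the dash.
birth, death = (m.strip() for m in re.findall(r'\d+[ \w]+', text))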
print(birth)
print(death)
Output:
2 October 1869
30 January 1948
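The pattern above is deliberately loose: it accepts any run that starts with a digit, so unpacking into two names raises ValueError as soon as the sentence contains another number. Here is a minimal sketch of a tighter variant. It assumes the dates are always written as day, month name, four-digit year, and that the pair sits right before the closing parenthesis. The names `DATE`, `RANGE_RE` and `extract_life_dates` are illustrative, and the sample sentence is a shortened version of the one in the question.

```python
import re

# Day, capitalised month name, four-digit year, e.g. "2 October 1869".
DATE = r"\d{1,2} [A-Z][a-z]+ \d{4}"

# Two such dates joined by a hyphen, en dash (U+2013) or em dash (U+2014),
# immediately followed by the closing parenthesis.
RANGE_RE = re.compile(
    rf"(?P<birth>{DATE})\s*[-\u2013\u2014]\s*(?P<death>{DATE})\s*\)"
)


def extract_life_dates(sentence):
    """Return (birth, death) as strings, or None if no date range is found."""
    match = RANGE_RE.search(sentence)
    if match is None:
        return None
    return match.group("birth"), match.group("death")


# Shortened sample; the pronunciation text inside the parentheses is left out.
text = "Mohandas Karamchand Gandhi (2 October 1869 – 30 January 1948) was the preeminent leader."
result = extract_life_dates(text)
print(result)
```

Returning None instead of raising lets the caller skip sentences that do not follow the expected form.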
Answer 1 (score: 0)
import re
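# Inner groups are non-capturing so findall returns whole dates, not tuples.
# A raw string keeps \d and \s from being treated as string escapes.
date_pattern = r"(\d{1,2}(?:\s|-|/)?(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?|\d{1,2})(?:\s|-|/)?\d{2,4})"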
bio = "Mohandas Karamchand Gandhi (/ˈɡɑːndi, ˈɡæn-/; Hindustani: [ˈmoːɦənd̪aːs ˈkərəmtʃənd̪ ˈɡaːnd̪ʱi]; 2 October 1869 – 30 January 1948) was the preeminent leader of the Indian independence movement in British-ruled India."
matches = re.findall(date_pattern, bio)
if len(matches) > 1:
    # The first date in the sentence is the birth date, the second the death date.
    born = matches[0]
    died = matches[1]
    print("Born:", born)
    print("Died:", died)
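If the extracted strings need to be compared or reformatted, they can be turned into date objects. This is a rough sketch that assumes the "day month-name year" form shown in the question and an English locale, since `%B` matches the locale's full month names. The helper name `parse_long_date` is made up for illustration.

```python
from datetime import datetime


def parse_long_date(value):
    """Parse a string such as '2 October 1869' into a datetime.date."""
    return datetime.strptime(value.strip(), "%d %B %Y").date()


born = parse_long_date("2 October 1869")
died = parse_long_date("30 January 1948")

# Age in completed years: subtract one if the birthday had not yet
# been reached in the year of death.
age = died.year - born.year - ((died.month, died.day) < (born.month, born.day))

print(born.isoformat())  # 1869-10-02
print(died.isoformat())  # 1948-01-30
print(age)               # 78
```

Abbreviated month names such as "Oct" would need `%b` instead, so a pattern that accepts both forms has to try both formats.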