Question

I want to extract the following pattern from a data frame:

Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009

I wrote the following code to extract it:

d4=df.str.extractall(r'((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z][?:]*)((?:\d{1,2}(?:th|st|nd|rd)[,?:])\d{4})')

Unfortunately, it does not extract anything.

Answer 1

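I'm assuming your dates always have the form MMM DDst/nd/rd/th, YYYY with a two-digit day, so Mar 01st, 2009 rather than Mar 1st, 2009. Under that assumption, the following regular expression should work:

\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (?:[0-3][1]st|[0-2][2]nd|[0-2][3]rd|[1-3][0]th|[0-2][4-9]th), \d{4}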
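As a quick check of this pattern, here is a minimal sketch using only the standard re module. The name STRICT_DATE and the three extra test strings are made up for illustration. Because every alternative demands exactly two digits and ties the suffix to the last digit, the pattern skips single-digit days such as 1st, and also 11th, 12th and 13th.

```python
import re

# Answer 1's pattern, copied unchanged (STRICT_DATE is a hypothetical name).
STRICT_DATE = re.compile(
    r'\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) '
    r'(?:[0-3][1]st|[0-2][2]nd|[0-2][3]rd|[1-3][0]th|[0-2][4-9]th), \d{4}'
)

# The sample string from the question.
sample = 'Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009'
strict_matches = STRICT_DATE.findall(sample)
print(strict_matches)

# Illustrative inputs that are not in the question:
print(STRICT_DATE.findall('Mar 01st, 2009'))  # two-digit day: accepted
print(STRICT_DATE.findall('Mar 1st, 2009'))   # one-digit day: rejected
print(STRICT_DATE.findall('Mar 11th, 2009'))  # 11 ends in 1, so only "st" is allowed
```

If your data contains days written like 1st or 11th, a looser day pattern such as \d{1,2}(?:st|nd|rd|th) is probably the safer choice.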


Answer 2

I see several problems with your pattern, so I rewrote it from scratch:

(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+\d{1,2}(?:th|st|nd|rd),\s+\d{4}

Here is an explanation of the pattern:

(?:Jan|Feb|...|Dec)    match, but do not capture, the abbreviated month name
\s+                    one or more whitespace characters
\d{1,2}                day as one or two digits
(?:th|st|nd|rd)        match, but do not capture, the ordinal suffix
,                      a literal comma
\s+                    one or more whitespace characters
\d{4}                  a four-digit year

Full code:

import re

my_str = 'Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009'

# With no capturing groups, findall returns each full match as a string.
match = re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+\d{1,2}(?:th|st|nd|rd),\s+\d{4}', my_str)

for item in match:
    print(item)
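If you need the individual parts rather than the whole string, the same pattern can be given named groups. The following is only a sketch: MONTHS, DATE_RE, and parse_dates are hypothetical names, and converting to datetime.date is an extra step that the question did not ask for.

```python
import re
from datetime import date

# Abbreviated month names in calendar order, so index + 1 is the month number.
MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# Same structure as the pattern above, with named groups for each part.
DATE_RE = re.compile(
    r'(?P<month>' + '|'.join(MONTHS) + r')\s+'
    r'(?P<day>\d{1,2})(?:th|st|nd|rd),\s+'
    r'(?P<year>\d{4})'
)

def parse_dates(text):
    """Return a datetime.date for every 'Mon DDth, YYYY' occurrence in text."""
    return [
        date(int(m['year']), MONTHS.index(m['month']) + 1, int(m['day']))
        for m in DATE_RE.finditer(text)
    ]

parsed = parse_dates('Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009')
print(parsed)
```

Note that date() raises ValueError for impossible dates such as Feb 31st, so real data may need a try/except around the conversion.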


Answer 3

Your pattern needs to allow for the whitespace between the month, the day, and the year.

((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\s+((?:\d{1,2}(?:th|st|nd|rd)[,?:])\s+\d{4})

 (                             # (1 start)
      (?:
           Jan
        |  Feb
        |  Mar
        |  Apr
        |  May
        |  Jun
        |  Jul
        |  Aug
        |  Sep
        |  Oct
        |  Nov
        |  Dec
      )
 )                             # (1 end)
 \s+ 
 (                             # (2 start)
      (?:
           \d{1,2} 
           (?: th | st | nd | rd )
           [,?:] 
      )
      \s+ 
      \d{4} 
 )                             # (2 end)
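To see the difference, the sketch below runs the question's pattern and the pattern from this answer against the sample string using the standard re module (the variable names are illustrative). The original fails because [a-z] demands a lowercase letter right after the month, whereas the data has a space there. Also note that [,?:] is a character class that matches a comma, a question mark, or a colon, not an optional comma.

```python
import re

sample = 'Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009'

# The question's pattern: no whitespace is allowed anywhere.
original = (r'((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z][?:]*)'
            r'((?:\d{1,2}(?:th|st|nd|rd)[,?:])\d{4})')

# This answer's pattern: \s+ between month and day, and between day and year.
fixed = (r'((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec))\s+'
         r'((?:\d{1,2}(?:th|st|nd|rd)[,?:])\s+\d{4})')

# With two capturing groups, findall returns one (group 1, group 2) tuple per match.
original_hits = re.findall(original, sample)
fixed_hits = re.findall(fixed, sample)

print(original_hits)
print(fixed_hits)
```

With pandas, Series.str.extractall should behave analogously, returning one column per capturing group and one row per match.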

Answer 4

You can use re.split.

Regex: ;\s

Explanation:

; matches a literal semicolon, and \s matches any single whitespace character

Python code:

import re

def split_dates(text):
    # Split wherever a semicolon is followed by a whitespace character.
    return re.split(r';\s', text)

print(split_dates("Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009"))

Output:

['Mar 20th, 2009', 'Mar 21st, 2009', 'Mar 22nd, 2009']
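One caveat: splitting on ;\s leaves a trailing semicolon attached to the last item if the input ends with one, and it does not split where a semicolon is not followed by whitespace. A slightly more tolerant variant, sketched here with the hypothetical helper name split_entries, splits on ;\s* and drops empty pieces.

```python
import re

def split_entries(text):
    """Split on ';' plus optional whitespace and drop empty pieces."""
    return [part for part in re.split(r';\s*', text.strip()) if part]

# Input with a trailing semicolon (an illustrative variation on the question's data).
with_trailing = split_entries('Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009;')
print(with_trailing)

# Input with no space after the semicolons.
no_spaces = split_entries('Mar 20th, 2009;Mar 21st, 2009')
print(no_spaces)
```

Keep in mind that splitting only works when the text contains nothing but dates. To pull dates out of free text, a matching pattern is the better fit.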


Extracting a pattern with regex in Python
