Question

I'm trying to use regex in Python to extract a small substring from a large string, if another keyword is found in the string.

e.g.:

s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"

if "year" in s:
    print("The year is = ", VALUE_OF_YEAR)  # <--- here I hope to somehow get the year substring from the above string and print it.

i.e., the answer would look like

The year is = onefour

Please note: the value changes if it represents a different number, e.g. onethree, oneseven, etc.

I basically want to copy everything from = up to ; if I find YEAR in the string, and print it out.

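I'm not quite sure how to do this.

I tried using Python's string manipulation methods, but so far I haven't found a way to copy exactly all the words up to ';' in the string.

Any help would be much appreciated. Any other approach is welcome too.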

Answer 1

You can also use a capturing (saving) group to grab the year value:

>>> import re
>>> 
>>> pattern = re.compile(r"YEAR=(\w+);")
>>> s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"
>>> pattern.search(s).group(1)
'onefour'

You may also want to handle the case where there is no match. For example, return None:

import re

def get_year_value(s):
    pattern = re.compile(r"YEAR=(\w+);")
    match = pattern.search(s)

    return match.group(1) if match else None
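
To see both branches of the function above, here is a small usage sketch. It is a slight variation: the pattern is compiled once at module level under the hypothetical name YEAR_PATTERN, which is not part of the answer and is only an assumption made for illustration.

```python
import re

# Hypothetical module-level name; compiled once instead of on every call.
YEAR_PATTERN = re.compile(r"YEAR=(\w+);")

def get_year_value(s):
    """Return the text between 'YEAR=' and ';', or None if there is no match."""
    match = YEAR_PATTERN.search(s)
    return match.group(1) if match else None

with_year = get_year_value("1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9")
without_year = get_year_value("1  0001    1   UG  science,ee;standard->2;district->9")

print(with_year)     # onefour
print(without_year)  # None
```

The caller can then test the result with `is None` before printing.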

Answer 2

You can use a regular expression to get the value:

(?<=\bYEAR=)[^;]+

The regex matches:

(?<=\bYEAR=) - only if the string we are looking for is preceded by the whole word YEAR= ...
[^;]+ - matches 1 or more characters other than ;.

Here is sample Python code:

import re
p = re.compile(r'(?<=\bYEAR=)[^;]+')
test_str = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"
robj = re.search(p, test_str)
if robj:
    print(robj.group(0))

If everyone is so fond of capturing groups, here is the same expression with the lookbehind replaced by a capturing group:

\bYEAR=([^;]+)

In Python:

import re
p = re.compile(r'\bYEAR=([^;]+)')
test_str = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"
robj = re.search(p, test_str)
if robj:
    print(robj.group(1))

Note that if your YEAR value contains hyphens or other non-word characters, \w will not help you. A negated character class is your best friend.
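
The difference between \w+ and [^;]+ can be shown with a short sketch. The hyphenated value one-four below is a made-up input, not one from the question, chosen only to illustrate the point.

```python
import re

# Hypothetical input: the YEAR value contains a hyphen.
s = "1  0001    1   UG  science,ee;YEAR=one-four;standard->2;district->9"

# \w+ stops at the hyphen, so the required ';' never follows and the search fails.
word_match = re.search(r"YEAR=(\w+);", s)

# [^;]+ accepts anything except ';', so the whole value is captured.
negated_match = re.search(r"\bYEAR=([^;]+)", s)

print(word_match)              # None
print(negated_match.group(1))  # one-four
```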

Answer 3

This is what I use:

if "YEAR" in s:
    year = s.split('YEAR=')[1].split(';')[0]
    print("The year is = " + year)
#this is the output
> The year is = onefour

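Basically, what it does is split the line after YEAR= and before ;. The [1] takes the part to the right of the substring YEAR=, and the [0] takes the part to the left of the substring ;.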
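
The two splits can be followed step by step in the sketch below. It also shows one possible variation using str.partition, which avoids an IndexError when the string contains YEAR without a following =. The function name extract_year is hypothetical and not part of the answer.

```python
s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"

# Step 1: everything to the right of 'YEAR='.
after_key = s.split("YEAR=")[1]
# Step 2: everything to the left of the first ';' in that remainder.
year = after_key.split(";")[0]

print(after_key)  # onefour;standard->2;district->9
print(year)       # onefour

def extract_year(text):
    """Hypothetical helper: return the value after 'YEAR=' up to ';', or None."""
    _, found, rest = text.partition("YEAR=")
    if not found:
        # 'YEAR=' does not occur, so there is nothing to extract.
        return None
    return rest.partition(";")[0]

print(extract_year(s))               # onefour
print(extract_year("YEAR missing"))  # None
```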

Answer 4

YEAR=(?P<year>\w+);

This should work.
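
As a minimal sketch of how the named group in that pattern could be used from Python (the variable names here are only for illustration), the captured value can be read back by name rather than by index:

```python
import re

s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"

# (?P<year>...) names the capturing group 'year'.
match = re.search(r"YEAR=(?P<year>\w+);", s)
if match:
    year = match.group("year")
    print("The year is = " + year)  # The year is = onefour
```

match.groupdict() returns the same value in a dictionary keyed by group name.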

Answer 5

Try this regex:

".*(?=YEAR).*YEAR=(.*?);.*"g

and replace with \1
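
In Python, this substitution approach could be sketched with re.sub, as below. The trailing g flag from the regex-tester notation has no direct counterpart here, since re.sub already replaces every match by default. Note one assumption-free caveat: when the pattern does not match, re.sub returns the input unchanged, so this approach alone cannot tell "no YEAR found" apart from a real value.

```python
import re

# Same pattern as above, without the tester's quotes and 'g' flag.
pattern = re.compile(r".*(?=YEAR).*YEAR=(.*?);.*")

s = "1  0001    1   UG  science,ee;YEAR=onefour;standard->2;district->9"

# The pattern consumes the whole line, so replacing it with group 1
# leaves only the captured year value.
year = pattern.sub(r"\1", s)
print(year)  # onefour

# No match: the original string comes back untouched.
unchanged = pattern.sub(r"\1", "no keyword here")
print(unchanged)  # no keyword here
```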


Extract a substring from a line in Python if another keyword is found
