Question

What I'm trying to do is match a string "A" that occurs multiple times in a text file, but I only want to match it before a string "B" occurs.

For example, the text file might be:

(cell = "33"
Level = "2"

(
cGG Track
sample = "ThisSample"
Level = "201" )

(cGG Track
sample = "ThisOtherSample)
)

I want to match the value of Level, but only before the line "sample = "..."" occurs. So in the example above, I want to match "2".

If the example is as follows:

(
ParamFigit = "3e"

(cGggTrack
sample = "ex"
Level = "3")
)

I don't want a Level at all; I just want to set it to 0.

I use:

levelRegex = re.compile(r'Level = "(.*)"')
levelMatch = levelRegex.findall(MyText)

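I use this to get the value of Level. The problem is that I'm getting the wrong one. I can't just take the regex's first match, because a Level won't always occur before "sample = "x"".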

Any help would be great!

Answer 1

Read the file line by line:

import re
from itertools import takewhile

with open('yourfile') as fh:
    # Stop at the first line containing 'sample'; only earlier lines are checked.
    for line in takewhile(lambda x: 'sample' not in x, fh):
        m = re.match(r'Level = "([^"]*)"', line)
        if m:
            print(m.group(1))

If you slurp the whole file into a string s (Akshat Mahajan's way):

# Keep only the text before the first 'sample', then find a line starting with Level.
m = re.search(r'^Level = "([^"]*)"', s.split('sample', 1)[0], re.M)
if m:
    print(m.group(1))

The "pure" regex way (ugly): from the start of the string, reach the first "Level =" that occurs at the start of a line, while forbidding the word "sample" before it.
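
The pure-regex idea described above can be sketched in a single pattern. This is a minimal sketch, not the answerer's original expression: the pattern, `PURE_RX`, and `level_before_sample` are assumptions introduced for illustration. It lazily skips whole lines that do not contain "sample", then requires a line beginning with `Level = "`.

```python
import re

# Hypothetical pattern: from the start of the string, lazily skip complete
# lines that do not contain "sample", then match a line starting with Level.
PURE_RX = re.compile(r'\A(?:(?!.*sample).*\n)*?Level = "([^"]*)"')

def level_before_sample(text, default="0"):
    """Return the first Level value seen before any line containing 'sample'.

    Falls back to `default` when no such Level exists (hypothetical helper).
    """
    m = PURE_RX.match(text)
    return m.group(1) if m else default
```

Because a line containing "sample" can never be skipped, the match fails as soon as such a line is reached, and the helper returns the default of "0" that the question asks for.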

Answer 2

You can use a combination of regex lookahead and lookbehind to extract the data.

Your regex would be

(?<=Level = )\"((?:\d+\.)?\d+)\".*?(?=sample = \")

Example:

import re

with open('data.txt', 'r') as fp:
    data = fp.read()

    rx = re.compile(r"(?<=Level = )\"((?:\d+\.)?\d+)\".*?(?=sample = \")",
                    re.IGNORECASE | re.DOTALL)

    result = [int(num) for num in rx.findall(data)]

    print(result)

Given the sample content you provided in your post, it will print:

[2, 201]
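
The list above contains every Level that is followed by a sample line, while the question asks for a single value: the Level before the first sample line, or 0 when there is none. One way to narrow it down might look like the sketch below. The helper name `level_before_first_sample` and the `SAMPLE_RX` pattern are assumptions added for illustration, and the value is returned as a string because the pattern also accepts decimals such as "1.5".

```python
import re

# Same pattern as in the example above.
RX = re.compile(r'(?<=Level = )"((?:\d+\.)?\d+)".*?(?=sample = ")',
                re.IGNORECASE | re.DOTALL)
# Hypothetical helper pattern used to locate the first sample line.
SAMPLE_RX = re.compile(r'sample = "', re.IGNORECASE)

def level_before_first_sample(data, default="0"):
    """Return the Level value that precedes the first sample line, else default."""
    sample = SAMPLE_RX.search(data)
    level = RX.search(data)  # leftmost Level that is followed by a sample line
    if sample and level and level.start() < sample.start():
        return level.group(1)
    return default
```

Comparing the two match positions rejects a Level that only appears after the first sample line, which the plain `findall` result cannot distinguish.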

Answer 3

Python's re module supports lookahead assertions, so this seems to work.

(?s)^(?:(?!\b(?:Level|sample)\s*=\s*".*?").)*\bLevel\s*=\s*"(.*?)"(?:(?!\b(?:Level|sample)\s*=\s*".*?").)*\bsample\s*=\s*".*?"

The Level value is captured in group 1.

Expanded

 (?s)                          # Dot-all modifier
 ^                             # BOS
 (?:                           # Not Level or Sample
      (?! \b (?: Level | sample ) \s* = \s* " .*? " )
      . 
 )*
 \b Level \s* = \s*            # Level
 "
 ( .*? )                       # (1), Value
 "
 (?:                           # Not Level or Sample
      (?! \b (?: Level | sample ) \s* = \s* " .*? " )
      . 
 )*
 \b sample \s* = \s* " .*? "   # Sample

Output

 **  Grp 0 -  ( pos 0 , len 64 ) 
(cell = "33"
Level = "2"

(
cGG Track
sample = "ThisSample"  
 **  Grp 1 -  ( pos 23 , len 1 ) 
2
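
For reference, the pattern above could be driven from Python roughly as follows. This is a sketch under assumptions: the helper name `first_level` and the fallback of "0" (taken from the question's requirement) are not part of the answer itself.

```python
import re

# The one-line pattern from this answer, unchanged; (?s) makes "." match newlines.
LEVEL_RX = re.compile(
    r'(?s)^(?:(?!\b(?:Level|sample)\s*=\s*".*?").)*'
    r'\bLevel\s*=\s*"(.*?)"'
    r'(?:(?!\b(?:Level|sample)\s*=\s*".*?").)*'
    r'\bsample\s*=\s*".*?"'
)

def first_level(text, default="0"):
    """Return group 1 (the Level before the first sample), else default."""
    m = LEVEL_RX.search(text)
    return m.group(1) if m else default
```

Since `^` is used without the multiline flag, the match is anchored at the start of the text, so a Level that only appears after a sample line never matches.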

Regex: find string A only before string B occurs - Python

3 answers: