Regex Multiline Python

Time: 2016-05-30 12:41:07

Tags: python regex

I am currently trying to run a regular expression in Python that should match across multiple lines.

([0-9]{2}\.[0-9]{2}\.[0-9]{4}\s[0-9]{2}:[0-9]{2}:[0-9]{2}).*?\r\n-{1,}\sFG\s{3,}?4

That is my regular expression, and this is my Python call:

re.findall("([0-9]{2}\.[0-9]{2}\.[0-9]{4}\s[0-9]{2}:[0-9]{2}:[0-9]{2}).*?\r\n-{1,}\sFG\s{3,}?4.*?", content, flags=re.M)

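However, when I use the regex in Notepad++ it gives me the correct matches, whereas in Python it matches nothing at all. Here is a sample string that matches in Notepad++ but not in Python: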

19.04.2016 01:59:18 ASDF

---- FG 3

 --------------- ASDF

19.04.2016 01:59:21 ASDF

---- FG 4

 --------------- ASDF

19.04.2016 01:59:22 ASDF

---- FG 4

 --------------- ASDF

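I am also sure that there really is a \r\n, because Notepad++ gives me the matches.

Since I am using the multiline flag, I have no idea why my regex does not work.

4 answers:

Answer 0 (score: 2)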

Note that with the corrected input shown, the FG\s{3,}?4 part of the pattern prevents a match, because the single space between FG and 4 does not satisfy \s{3,}?. With \s{1,}? instead:

#! /usr/bin/env python
from __future__ import print_function
import re

content = "19.04.2016 05:31:03 ASDFASDF\r\n---- FG 4 "
pattern = (r'([0-9]{2}\.[0-9]{2}\.[0-9]{4}\s[0-9]{2}:[0-9]{2}:[0-9]{2}).*?'
           r'\r\n-{1,}\sFG\s{1,}?4.*?')
print(re.findall(pattern, content, flags=re.M))

gives me (unmodified Python 2.7.11 and 3.5.1):

['19.04.2016 05:31:03']

Edit: here is a version for the updated input sample, as transcribed by @poke:

#! /usr/bin/env python
from __future__ import print_function
import re

content = ("19.04.2016 05:31:03  ASDFASDF\r\n---- FG   4"
           "\r\n19.04.2016 05:31:03  ASDFASDF\r\n---- FG   4"
           "\r\n19.04.2016 05:31:03  ASDFASDF\r\n---- FG   4"
           "\r\n19.04.2016 05:31:03  ASDFASDF\r\n---- FG   4"
           "\r\n19.04.2016 05:31:03  ASDFASDF\r\n---- FG   4")
pattern = (r'([0-9]{2}\.[0-9]{2}\.[0-9]{4}\s[0-9]{2}:[0-9]{2}:[0-9]{2}).*?'
           r'\r\n-{1,}\sFG\s{1,}?4.*?')
print(re.findall(pattern, content, flags=re.M))

which gives (as expected):

['19.04.2016 05:31:03', '19.04.2016 05:31:03', '19.04.2016 05:31:03', '19.04.2016 05:31:03', '19.04.2016 05:31:03']

Answer 1 (score: 0)

If your input contains \r\n line breaks and you correct the spacing after the 'FG' part, it should work:

([0-9]{2}\.[0-9]{2}\.[0-9]{4}\s[0-9]{2}:[0-9]{2}:[0-9]{2}).*?\r\n-{1,}\sFG\s+?4

Tested here (with \n line breaks only): https://regex101.com/r/iT1rF2/2
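To see the effect of the quantifier in isolation, both variants can be run against a line that has only one space after FG. This is just a minimal sketch: the sample string and the names prefix, strict and loose are made up for illustration and do not come from the question.

```python
import re

# Hypothetical sample: a single space between "FG" and "4", CRLF line endings.
sample = "19.04.2016 01:59:21 ASDF\r\n---- FG 4\r\n"

# Shared first part of the pattern, up to and including "FG".
prefix = (r"([0-9]{2}\.[0-9]{2}\.[0-9]{4}\s[0-9]{2}:[0-9]{2}:[0-9]{2}).*?"
          r"\r\n-{1,}\sFG")

strict = re.compile(prefix + r"\s{3,}?4")  # needs at least three whitespace characters
loose = re.compile(prefix + r"\s+?4")      # needs at least one whitespace character

strict_hits = strict.findall(sample)
loose_hits = loose.findall(sample)

print(strict_hits)  # empty: one space cannot satisfy \s{3,}?
print(loose_hits)   # the timestamp is captured
```

If the real data sometimes has one space and sometimes several, \s+ covers both cases.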

Answer 2 (score: 0)

Works for me:

>>> content = '''19.04.2016 05:31:03  ASDFASDF\r
---- FG   4\r
19.04.2016 05:31:03  ASDFASDF\r
---- FG   4\r
19.04.2016 05:31:03  ASDFASDF\r
---- FG   4\r
19.04.2016 05:31:03  ASDFASDF\r
---- FG   4\r
19.04.2016 05:31:03  ASDFASDF\r
---- FG   4'''
>>> re.findall("([0-9]{2}\.[0-9]{2}\.[0-9]{4}\s[0-9]{2}:[0-9]{2}:[0-9]{2}).*?\r\n-{1,}\sFG\s{3,}?4.*?", content, flags=re.M)
['19.04.2016 05:31:03', '19.04.2016 05:31:03', '19.04.2016 05:31:03', '19.04.2016 05:31:03', '19.04.2016 05:31:03']

Note that I had to add an explicit \r at the end of each line. You said your text contains an actual \r, but make sure that this is really the case.

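If you are reading the content from a file, note that Python performs newline normalization when you open a file. So although the file originally contains \r\n, you may end up with only \n.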
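The normalization can be checked with a small experiment. The sketch below assumes Python 3, where open() accepts a newline argument (on Python 2, io.open offers the same parameter). The temporary file and its contents are invented for the demonstration.

```python
import os
import re
import tempfile

pattern = re.compile(r"(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2}:\d{2}).*?\r\n-+\sFG\s+?4")

# Write a small CRLF file as raw bytes (hypothetical sample data).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"19.04.2016 01:59:21 ASDF\r\n---- FG 4\r\n")

try:
    # Default text mode: universal newlines turn "\r\n" into "\n".
    with open(path, "r") as f:
        translated = f.read()
    # newline="" switches the translation off, so "\r\n" is kept.
    with open(path, "r", newline="") as f:
        untranslated = f.read()
finally:
    os.remove(path)

print(repr(translated))
print(pattern.findall(translated))    # nothing: there is no "\r\n" left
print(pattern.findall(untranslated))  # the timestamp is captured
```

So either open the file with newline="" (or in binary mode with a bytes pattern), or write the pattern so that it does not insist on the \r.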

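Answer 3 (score: 0)

Note that you can rewrite the regex using shorthand character classes, so that your pattern:

([0-9]{2}\.[0-9]{2}\.[0-9]{4}\s[0-9]{2}:[0-9]{2}:[0-9]{2}).*?\r\n-{1,}\sFG\s{3,}?4

becomes (with shorthands and the spacing correction):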

(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2}:\d{2}).*?\r\n-+\sFG\s+?4
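As a rough check, the shortened pattern can be tried on data modeled on the sample in the question. Two things in this sketch are assumptions and not part of the pattern above: the sample is given CRLF line endings, and \r\n is relaxed to \r?\n so the same pattern also works after newline translation. No re.M flag is passed, since that flag only changes how ^ and $ behave and the pattern uses neither.

```python
import re

# Sample modeled on the question's data; the CRLF line endings are an assumption.
content = (
    "19.04.2016 01:59:18 ASDF\r\n---- FG 3\r\n --------------- ASDF\r\n"
    "19.04.2016 01:59:21 ASDF\r\n---- FG 4\r\n --------------- ASDF\r\n"
    "19.04.2016 01:59:22 ASDF\r\n---- FG 4\r\n --------------- ASDF\r\n"
)

# Shorthand version with an optional "\r" in front of the "\n".
pattern = re.compile(r"(\d{2}\.\d{2}\.\d{4}\s\d{2}:\d{2}:\d{2}).*?\r?\n-+\sFG\s+?4")

crlf_hits = pattern.findall(content)
lf_hits = pattern.findall(content.replace("\r\n", "\n"))

print(crlf_hits)  # only the two "FG 4" entries
print(lf_hits)    # same result with bare "\n" line endings
```

Keep in mind that on Python 3, \d in a str pattern also matches non-ASCII decimal digits unless re.ASCII is set, which is harmless for this kind of log data but not strictly identical to [0-9].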