I have this example string:
line = '[text] something - https://www.myurl.com/test1/ lorem ipsum https://www.myurl.com/test2/ - https://www.myurl.com/test3/ marker needle - some more text at the end'
I need to extract the path (without the slashes) that comes right before "marker needle". The following lists all of the paths:
print(re.findall('https://www\\.myurl\\.com/(.+?)/', line))
# ['test1', 'test2', 'test3']
But when I change it to find only the path I want (the one before "marker needle"), it gives a strange output:
print(re.findall('https://www\\.myurl\\.com/(.+?)/ marker needle', line))
# ['test1/ lorem ipsum https://www.myurl.com/test2/ - https://www.myurl.com/test3']
My expected output:
test3
I also tried the same thing another way, but the result was the same.
Answer 0 (score: 2)
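This expression has three capture groups, and the second one holds the output we want (the dots are escaped so that they match only a literal "."):
(https:\/\/www\.myurl\.com\/)([A-Za-z0-9-]+)(\/\smarker needle)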
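This tool can help us modify or change the expression if needed.
jex.im can be used to visualize the regular expression.
A Python test of the expression: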
# -*- coding: UTF-8 -*-
import re

string = "[text] something - https://www.myurl.com/test1/ lorem ipsum https://www.myurl.com/test2/ - https://www.myurl.com/test3/ marker needle - some more text at the end"
# The dots are escaped so that they match a literal "." only.
expression = r'(https:\/\/www\.myurl\.com\/)([A-Za-z0-9-]+)(\/\smarker needle)'
match = re.search(expression, string)
if match:
    # Group 2 holds the path segment right before "marker needle".
    print("YAAAY! \"" + match.group(2) + "\" is a match ")
else:
    print(' Sorry! No matches!')

Output:
YAAAY! "test3" is a match
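For comparison with the attempt in the question, here is a minimal sketch (not part of the original answer) of why the lazy `(.+?)` version returns the long string. The regex engine reports the leftmost match, so the match starts at the first URL and the lazy group simply keeps growing until `/ marker needle` can follow. A capture that cannot contain `/`, such as `[^/]+` or the `[A-Za-z0-9-]+` class used above, has to stay inside a single path segment. The helper name `path_before_marker` is hypothetical and only used for illustration.

```python
import re

line = ("[text] something - https://www.myurl.com/test1/ lorem ipsum "
        "https://www.myurl.com/test2/ - https://www.myurl.com/test3/ "
        "marker needle - some more text at the end")

# Lazy group: the match still starts at the leftmost URL, so the group
# grows across the other URLs until "/ marker needle" can follow.
lazy = re.findall(r'https://www\.myurl\.com/(.+?)/ marker needle', line)


def path_before_marker(text):
    """Return the path segment directly before '/ marker needle', or None."""
    # [^/]+ cannot cross a slash, so only the URL that is immediately
    # followed by the marker can satisfy the whole pattern.
    m = re.search(r'https://www\.myurl\.com/([^/]+)/ marker needle', text)
    return m.group(1) if m else None


print(lazy)
print(path_before_marker(line))  # test3
```

Because the helper uses a single capture group, `m.group(1)` plays the role that `match.group(2)` plays in the three-group expression.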
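This JavaScript snippet reports the runtime of a for loop that repeats the replacement (one million times according to the description; repeat is set to 10 in the code as shown).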
const repeat = 10;
const start = Date.now();
let match;
// Run the replacement exactly `repeat` times.
for (let i = 0; i < repeat; i++) {
    // The outer (.*) groups consume the rest of the string, so replacing with $3 leaves only the path.
    const regex = /(.*)(https:\/\/www\.myurl\.com\/)([A-Za-z0-9-]+)(\/\smarker needle)(.*)/gm;
    const str = "[text] something - https://www.myurl.com/test1/ lorem ipsum https://www.myurl.com/test2/ - https://www.myurl.com/test3/ marker needle - some more text at the end";
    const subst = `$3`;
    match = str.replace(regex, subst);
}
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. ");
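A rough Python counterpart to the JavaScript timing above might look like the following. This is a sketch under assumptions: it uses the standard-library `timeit` module and a precompiled pattern, neither of which appears in the original answer, and the repeat count of 10000 is arbitrary. Timings depend on the machine, so no figures are shown.

```python
import re
import timeit

line = ("[text] something - https://www.myurl.com/test1/ lorem ipsum "
        "https://www.myurl.com/test2/ - https://www.myurl.com/test3/ "
        "marker needle - some more text at the end")

# Compile once so that the timing measures the search, not the compilation.
pattern = re.compile(r'https://www\.myurl\.com/([A-Za-z0-9-]+)/\smarker needle')


def extract():
    """Return the path segment before the marker, or None if absent."""
    m = pattern.search(line)
    return m.group(1) if m else None


repeat = 10000  # arbitrary repeat count chosen for this sketch
elapsed = timeit.timeit(extract, number=repeat)  # total seconds for all runs

print('"%s" is a match' % extract())
print("%.4f s is the runtime of %d searches" % (elapsed, repeat))
```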