Question

Suppose I have a filename like the one below, and I want to extract part of it as a string in Python.

import re
fn = "DC_QnA_bo_v.15.12.3_DE_duplicates.xlsx"
rgx = re.compile('\b_[A-Z]{2}\b')
print(re.findall(rgx, fn))

The expected output is [DE], but the actual output is [].

Answer 1

You can use

(?<=_)[A-Z]+(?=_)

This uses lookarounds on both sides; see a demo on regex101.com. To get a stricter result, you would need to provide more example inputs.
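
As a rough sketch of how the lookaround pattern behaves in Python (the second input, `sample`, is a hypothetical string made up for illustration and is not from the question): because lookarounds do not consume the underscores, two adjacent tokens that share an underscore can both be found, which is not the case for a pattern that matches the underscores themselves.

```python
import re

fn = "DC_QnA_bo_v.15.12.3_DE_duplicates.xlsx"

# The underscores are only asserted, not consumed, so no capture group is needed.
lookaround = re.compile(r'(?<=_)[A-Z]+(?=_)')
matches = lookaround.findall(fn)
print(matches)  # ['DE']

# Hypothetical input with two adjacent tokens sharing one underscore.
sample = "x_AB_CD_y"
adjacent = lookaround.findall(sample)
consuming = re.findall(r'_([A-Z]+)_', sample)
print(adjacent)   # ['AB', 'CD']
print(consuming)  # ['AB'] (the shared underscore was already consumed)
```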

Answer 2

Use _([A-Z]{2})

For example:

import re
fn = "DC_QnA_bo_v.15.12.3_DE_duplicates.xlsx"
rgx = re.compile('_([A-Z]{2})')
print(rgx.findall(fn))           #You can use the compiled pattern to do findall.

Output:

['DE']
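
For context, here is a small check of why the pattern from the question returns an empty list. This is only a sketch of the likely cause: in a non-raw Python string, `'\b'` is a backspace character rather than a regex word boundary, and even in a raw string `\b` cannot match next to `_`, because the underscore counts as a word character.

```python
import re

fn = "DC_QnA_bo_v.15.12.3_DE_duplicates.xlsx"

# Without the r prefix, '\b' is the backspace character (0x08).
non_raw = '\b_[A-Z]{2}\b'
starts_with_backspace = non_raw[0] == '\x08'
non_raw_result = re.findall(non_raw, fn)

# With a raw string, \b is a word boundary, but "3_" and "E_" are
# word-character pairs, so there is no boundary at either position.
raw_result = re.findall(r'\b_[A-Z]{2}\b', fn)

print(starts_with_backspace)  # True
print(non_raw_result)         # []
print(raw_result)             # []
```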

Answer 3

Your expected output appears to be DE, which is bounded by an underscore on its left and right. This expression might also work:

# -*- coding: UTF-8 -*-
import re

string = "DC_QnA_bo_v.15.12.3_DE_duplicates.xlsx"
expression = r'_([A-Z]+)_'
match = re.search(expression, string)
if match:
    print("YAAAY! \"" + match.group(1) + "\" is a match  ")
else: 
    print(' Sorry! No matches!')

Output:

YAAAY! "DE" is a match

Alternatively, if needed, you can add a quantifier to the character class.
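
The quantifier variant mentioned above is not spelled out in the text. As a sketch, assuming it means restricting the match to exactly two letters with `{2}` (as in the question's own pattern), the difference from `+` shows up on a hypothetical filename that contains a longer uppercase token:

```python
import re

# Hypothetical filename, invented for illustration only.
name = "report_ABC_v1_DE_final.xlsx"

# + accepts any run of uppercase letters, so the first hit is "ABC".
any_length = re.search(r'_([A-Z]+)_', name).group(1)

# {2} (an assumed choice of quantifier) accepts exactly two letters.
two_letters = re.search(r'_([A-Z]{2})_', name).group(1)

print(any_length)   # ABC
print(two_letters)  # DE
```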


Answer 4

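Try the pattern: \_([^\_]+)\_[^\_\.]+\.xlsx

Explanation:

\_ - matches _ literally

[^\_]+ - a negated character class with the + operator: matches one or more characters other than _

[^\_\.]+ - same as above, but this time matches characters other than _ and .

\.xlsx - matches .xlsx literally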


The idea is to match the last _something_ pattern before the .xlsx extension.
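
A minimal sketch of this pattern in Python, using the filename from the question. The second pattern is an assumed simplification: in Python's `re`, the backslashes before `_` (and before `.` inside a character class) are not required, so both forms should behave the same way.

```python
import re

fn = "DC_QnA_bo_v.15.12.3_DE_duplicates.xlsx"

# Pattern as given in the answer; group 1 is the token between the
# last two underscores before the extension.
escaped = re.search(r'\_([^\_]+)\_[^\_\.]+\.xlsx', fn)
token = escaped.group(1) if escaped else None

# Same pattern without the optional backslashes.
plain = re.search(r'_([^_]+)_[^_.]+\.xlsx', fn)
token_plain = plain.group(1) if plain else None

print(token)        # DE
print(token_plain)  # DE
```

Earlier underscore-delimited parts such as `_QnA_` or `_bo_` are skipped because whatever follows them is not a single segment ending in `.xlsx`.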

Answer 5

You can use regular expressions (the re module) for this, but it can also be done without any import, as follows:

fn = "DC_QnA_bo_v.15.12.3_DE_duplicates.xlsx"
out = [i for i in fn.split('_')[1:] if len(i)==2 and i.isalpha() and i.isupper()]
print(out) # ['DE']

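Explanation: I split fn at _, discard the first element, and filter the remaining elements so that only strings of length 2 made up of uppercase letters are kept.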

Answer 6

Another re solution:

import re
fn = "DC_QnA_bo_v.15.12.3_DE_duplicates.xlsx"
rgx = re.compile(r'_([A-Z]{1,})_')  # {1,} is equivalent to +
print(re.findall(rgx, fn))  # ['DE']
