说我有三个字符串:
abc534loif
tvd645kgjf
tv96fjbd_gfgf
和三个清单:
beginning
仅捕获字符串“the name”middle
仅捕获数字end
仅包含数字部分之后的其余字符如何以最有效的方式实现这一目标?
答案 0 :(得分:2)
使用正则表达式?
>>> import re
>>> strings = 'abc534loif tvd645kgjf tv96fjbd_gfgf'.split()
>>> for s in strings:
... for match in re.finditer(r'\b([a-z]+)(\d+)(.+?)\b', s):
... print match.groups()
...
('abc', '534', 'loif')
('tvd', '645', 'kgjf')
('tv', '96', 'fjbd_gfgf')
答案 1 :(得分:1)
这是与语言无关的方法,旨在提高效率:
p0
p1
0
到p0-1
将子字符串提取到beginning
p0
到p1
将子字符串提取到middle
p1+1
到length-1
将子字符串提取到end
答案 2 :(得分:1)
我猜您正在寻找re.findall:
strs = """
abc534loif
tvd645kgjf
tv96fjbd_gfgf
"""
import re
print re.findall(r'\b(\w+?)(\d+)(\w+)', strs)
>> [('abc', '534', 'loif'), ('tvd', '645', 'kgjf'), ('tv', '96', 'fjbd_gfgf')]
答案 3 :(得分:1)
>>> import itertools as it
>>> s="abc534loif"
>>> [''.join(j) for i,j in it.groupby(s, key=str.isdigit)]
['abc', '534', 'loif']
答案 4 :(得分:0)
我会使用正则表达式,如:
(?P<beginning>[^0-9]*)(?P<middle>[^0-9]*)(?P<end>[^0-9]*)
并拉出三个匹配的部分。
import re
m = re.match(r"(?P<beginning>[^0-9]*)(?P<middle>[^0-9]*)(?P<end>[^0-9]*)", "abc534loif")
m.group('beginning')
m.group('middle')
m.group('end')
答案 5 :(得分:0)
import re #You want to match a string against a pattern so you import the regular expressions module 're'
mystring = "abc1234def" #Just a string to test with
match = re.match(r"^(\D+)([0)9]+](\D+)$") #Our regular expression. Everything between brackets is 'captured', meaning that it is accessible as one of the 'groups' in the returned match object. The ^ sign matches at the beginning of a string, while the $ matches the end. the characters in between the square brackets [0-9] are character ranges, so [0-9] matches any digit character, \D is any non-digit character.
if match: # match will be None if the string didn't match the pattern, so we need to check for that, as None.group doesn't exist.
beginning = match.group(1)
middle = match.group(2)
end = match.group(3)
答案 6 :(得分:0)
我是这样的:
>>> import re
>>> l = ['abc534loif', 'tvd645kgjf', 'tv96fjbd_gfgf']
>>> regex = re.compile('([a-z_]+)(\d+)([a-z_]+)')
>>> beginning, middle, end = zip(*[regex.match(s).groups() for s in l])
>>> beginning
('abc', 'tvd', 'tv')
>>> middle
('534', '645', '96')
>>> end
('loif', 'kgjf', 'fjbd_gfgf')