Question

I have a hostname like

ab-test-db-dev.0002-colo1-vm234.abc.domain.com

(Yes, there is no consistent convention within the hostnames.)

I am trying to split this hostname to get

ab-test-db-dev.0002-colo1-vm234

The rule is to split on '.', but only if no other special characters follow that dot.

I tried

pattern = domain.split(".")

but that only gives me

ab-test-db-dev and not ab-test-db-dev.0002-colo1-vm234

as the first element.

What is the best way to achieve this?

Answer 1

You can strip leading labels until no dashes remain; what is left is the domain name to remove from the hostname:

hostname = domain
while '-' in domain:
    domain = domain.partition('.')[-1]
hostname = hostname[:-len(domain) - 1]

Or go the other way and strip the last label for as long as it contains no dash, using str.rpartition():

hostname = domain
while True:
    first, _, end = hostname.rpartition('.')
    if '-' in end:
        break
    hostname = first
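One caveat with the loop above: given input that contains no dash at all, the break is never reached and the loop does not terminate. A minimal sketch of a guarded variant follows. The helper name strip_domain and the stop-when-no-dot condition are assumptions added here for illustration, not part of the original snippet.

```python
def strip_domain(fqdn):
    """Drop trailing dot-separated labels until one contains a dash.

    Hypothetical helper: stops once there is no dot left to split on,
    so dash-free input cannot loop forever.
    """
    hostname = fqdn
    while '.' in hostname:
        first, _, end = hostname.rpartition('.')
        if '-' in end:
            # The last label belongs to the host part; keep it.
            break
        hostname = first
    return hostname


print(strip_domain('ab-test-db-dev.0002-colo1-vm234.abc.domain.com'))
```

With dash-free input such as 'abc.domain.com' this sketch falls back to the first label ('abc'); whether that is the right fallback depends on your naming scheme.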

Or use a regular expression to find a trailing part that consists only of letters and dots:

import re

hostname = re.sub(r'\.[a-z.]+$', '', domain)

Demo:

>>> domain = 'ab-test-db-dev.0002-colo1-vm234.abc.domain.com'
>>> hostname = domain
>>> while '-' in domain:
...     domain = domain.partition('.')[-1]
... 
>>> hostname[:-len(domain) - 1]
'ab-test-db-dev.0002-colo1-vm234'
>>> domain = 'ab-test-db-dev.0002-colo1-vm234.abc.domain.com'
>>> hostname = domain
>>> while True:
...     first, _, end = hostname.rpartition('.')
...     if '-' in end:
...         break
...     hostname = first
... 
>>> hostname
'ab-test-db-dev.0002-colo1-vm234'
>>> import re
>>> re.sub(r'\.[a-z.]+$', '', domain)
'ab-test-db-dev.0002-colo1-vm234'
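Note that the character class [a-z.] only covers lowercase letters, so an uppercase hostname would come back unchanged. If that matters, a sketch along these lines might help. The helper name strip_domain_suffix is hypothetical, and the re.IGNORECASE flag is an addition, not part of the original pattern.

```python
import re


def strip_domain_suffix(fqdn):
    # Remove a trailing run of letter-only labels, ignoring case.
    return re.sub(r'\.[a-z.]+$', '', fqdn, flags=re.IGNORECASE)


print(strip_domain_suffix('AB-TEST-DB-DEV.0002-COLO1-VM234.ABC.DOMAIN.COM'))
# Caveat: a domain label containing a digit stops the match early,
# so only '.com' is removed here.
print(strip_domain_suffix('host-1.example2.com'))
```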

Answer 2

I don't quite get the pattern, but the following works for this case.

(?<=\d)\.

Try this. See the demo.

https://regex101.com/r/rU8yP6/21

Use re.split.

import re
re.split(r"(?<=\d)\.", test_str)
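To check the split against the hostname from the question, a short sketch (test_str is assumed to hold that hostname; the second input is a made-up example to show where the digit-based rule stops working):

```python
import re

test_str = 'ab-test-db-dev.0002-colo1-vm234.abc.domain.com'

# Split on every dot that is directly preceded by a digit.
parts = re.split(r"(?<=\d)\.", test_str)
print(parts)  # ['ab-test-db-dev.0002-colo1-vm234', 'abc.domain.com']

# Caveat: any earlier label ending in a digit also triggers a split.
other = re.split(r"(?<=\d)\.", 'ab-test1.0002-colo1-vm234.abc.domain.com')
print(other)  # ['ab-test1', '0002-colo1-vm234', 'abc.domain.com']
```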

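Or

^(.*?)(?!.*-)\.

Try this. See the demo.

https://regex101.com/r/rU8yP6/22

import re
test_str = 'ab-test-db-dev.0002-colo1-vm234.abc.domain.com'
print(re.findall(r"^(.*?)(?!.*-)\.", test_str))  # ['ab-test-db-dev.0002-colo1-vm234']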

Answer 3

If I understood your question correctly, this regex should do the job:

.*?(?=\.(?!.*[^\w.]))

>>> import re
>>> print(re.match(r'.*?(?=\.(?!.*[^\w.]))', 'ab-test-db-dev.0002-colo1-vm234.abc.domain.com').group())
ab-test-db-dev.0002-colo1-vm234

Explanation:

.*? # match everything up to...
(?=
    \. # the first dot...
    (?! # that isn't followed by...
        .* # any text and...
        [^\w.] # something that's not a word character or a dot.
    )
)
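The commented layout above can be used directly as a pattern by compiling it with re.VERBOSE, which ignores whitespace and # comments outside character classes. A minimal sketch (the variable names are placeholders):

```python
import re

pattern = re.compile(r"""
    .*?             # match everything up to...
    (?=
        \.          # the first dot...
        (?!         # that isn't followed by...
            .*      # any text and...
            [^\w.]  # something that's not a word character or a dot.
        )
    )
""", re.VERBOSE)

match = pattern.match('ab-test-db-dev.0002-colo1-vm234.abc.domain.com')
print(match.group())  # ab-test-db-dev.0002-colo1-vm234
```

pattern.match returns None when the input contains no qualifying dot (for example a bare 'localhost'), so check the result before calling .group().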
