Question

I created a regular expression to match the nth occurrence in a string:

^(?:[^-]*-){2}([^-].*)

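However, when I test it in a regex tool, it does not give a 100% correct match:

For example:

Original: ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams

Expected: ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath

Tested: ANIMAL - Animal Rage XL Pre

Original: AST Sports Science - R-ALA 200 - 90 Capsules

Expected: AST Sports Science - R-ALA 200

Tested: AST Sports Science - R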

As far as I understand, the regular expression above matches the second occurrence of " - ", so I created the next regular expression:

^(?:[^-]*\s-\s){2}([^-].*)

But it misses the examples above completely.
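For reference, the behaviour of both attempts can be reproduced with a short Python sketch. The variable names `samples`, `first` and `second` are illustrative only; the patterns and sample strings are the ones quoted above.

```python
import re

samples = [
    "ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams",
    "AST Sports Science - R-ALA 200 - 90 Capsules",
]

# First attempt: counts every hyphen, so the second hyphen it finds is the
# one inside "Pre-Workout" (or "R-ALA"), not the second " - " separator.
first = re.compile(r"^(?:[^-]*-){2}([^-].*)")

# Second attempt: requires whitespace around the hyphen, but [^-]* still
# cannot step over a bare hyphen, so the second repetition can never
# reach the next " - " and the whole match fails.
second = re.compile(r"^(?:[^-]*\s-\s){2}([^-].*)")

for text in samples:
    m1 = first.search(text)
    m2 = second.search(text)
    print("first :", m1.group(1) if m1 else None)
    print("second:", m2.group(1) if m2 else None)
```

The first pattern captures everything after the hyphen in "Pre-Workout" or "R-ALA", and the second pattern returns no match for either sample.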

What am I missing to make the regular expression work perfectly?

Thanks for your help.

Answer 1

You can try the following.

>>> import re
>>> s = 'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams'
>>> s1 = 'AST Sports Science - R-ALA 200 - 90 Capsules'
>>> re.search(r'^(?:.*? - .*?)(?= - )', s).group()
'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath'
>>> re.search(r'^(?:.*? - .*?)(?= - )', s1).group()
'AST Sports Science - R-ALA 200'

https://regex101.com/r/sJ9gM7/29
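Since the question asks about the nth occurrence, the lazy-match idea above could be generalized roughly as follows. This is only a sketch: the function name `prefix_before_nth_separator` and its parameters are made up for illustration, and it assumes the separator is a literal string such as " - ".

```python
import re


def prefix_before_nth_separator(text, n, sep=" - "):
    """Return the text before the n-th occurrence of sep, or None if
    the separator occurs fewer than n times (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    escaped = re.escape(sep)
    # Lazily skip n-1 separators, then lazily match up to (but not
    # including) the n-th one, which the lookahead checks for.
    pattern = r"^(?:.*?%s){%d}.*?(?=%s)" % (escaped, n - 1, escaped)
    match = re.search(pattern, text)
    return match.group() if match else None


s = 'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams'
s1 = 'AST Sports Science - R-ALA 200 - 90 Capsules'
print(prefix_before_nth_separator(s, 2))
print(prefix_before_nth_separator(s1, 2))
```

With `n=2` this should behave like the pattern in the answer; with `n=1` it returns only the text before the first separator.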

You can also use the re.sub function.

>>> re.sub(r' - (?:(?! - ).)*$', '', s)
'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath'
>>> re.sub(r' - (?:(?! - ).)*$', '', s1)
'AST Sports Science - R-ALA 200'

This matches the last part of a <space>hyphen<space>-separated string. Replacing the match with an empty string gives you the desired output.
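To make the pieces of that pattern easier to follow, here is the same expression written with re.VERBOSE and a comment per part. The names `LAST_PART` and `strip_last_part` are illustrative and not part of the original answer.

```python
import re

# In verbose mode a bare space is ignored, so each literal space is
# written as [ ] to keep the pattern equivalent to the one above.
LAST_PART = re.compile(
    r"""
    [ ]-[ ]            # the separator: space, hyphen, space
    (?:                # then, one character at a time:
        (?![ ]-[ ])    #   make sure another separator does not start here
        .              #   and consume that character
    )*
    $                  # all the way to the end of the string
    """,
    re.VERBOSE,
)


def strip_last_part(text):
    """Remove the final ' - ...' segment, if there is one."""
    return LAST_PART.sub("", text)


print(strip_last_part("AST Sports Science - R-ALA 200 - 90 Capsules"))
```

The negative lookahead is what keeps the match from starting at an earlier separator: any candidate that still has another " - " after it cannot reach the end of the string.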

Answer 2

It looks like you are looking for this regular expression: (?m)^(.*)(\s+\-\s+(?!\s\-\s).*)$

Sample code in Python:

import re

str1 = 'ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath - 151 Grams'
str2 = 'Anjolie Ayurveda - Rosemary Lavender and Neem Tulsi Soap Herbal Gift Box - CLEARANCE PRICED Nourish Your Skin & Awaken Your Senses'
# Keep only group 1, i.e. everything before the last " - " separator.
print(re.sub(r"(?m)^(.*)(\s+\-\s+(?!\s\-\s).*)$", r"\g<1>", str1))
print(re.sub(r"(?m)^(.*)(\s+\-\s+(?!\s\-\s).*)$", r"\g<1>", str2))

Output:

ANIMAL - Animal Rage XL Pre-Workout Grape of Wrath
Anjolie Ayurveda - Rosemary Lavender and Neem Tulsi Soap Herbal Gift Box
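The reason this works is that the greedy (.*) first grabs the whole line and then backtracks only as far as the last whitespace-hyphen-whitespace separator. A small sketch that inspects the two groups (the variable names are illustrative) shows the split:

```python
import re

pattern = re.compile(r"(?m)^(.*)(\s+\-\s+(?!\s\-\s).*)$")
text = 'AST Sports Science - R-ALA 200 - 90 Capsules'

match = pattern.search(text)
# Group 1 is the part that re.sub keeps; group 2 is the part it drops.
kept, dropped = match.group(1), match.group(2)
print(repr(kept))     # 'AST Sports Science - R-ALA 200'
print(repr(dropped))  # ' - 90 Capsules'
```

The hyphen in "R-ALA" is never treated as a separator because \s+ requires at least one whitespace character on each side of it.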

Regex: matching the second occurrence