How can I change a string according to some rules?

Time: 2016-04-13 05:35:03

Tags: python string

I have the following text, where each line contains two phrases separated by "\t":

RoadTunnel    RouteOfTransportation
LaunchPad   Infrastructure
CyclingLeague   SportsLeague
Territory   PopulatedPlace
CurlingLeague   SportsLeague
GatedCommunity  PopulatedPlace

What I want is to add _ between the individual words, so the result should be:

Road_Tunnel    Route_Of_Transportation
Launch_Pad  Infrastructure
Cycling_League  Sports_League
Territory   Populated_Place
Curling_League  Sports_League
Gated_Community Populated_Place

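There are no cases like "ABTest" or "aBTest", but there are cases with three words joined together, such as "RouteOfTransportation". I tried several approaches without success.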

One of my attempts was:

textProcessed = re.sub(r"([A-Z][a-z]+)(?=([A-Z][a-z]+))", r"\1_", text)

But it gave no result.

3 answers:

Answer 0: (score: 4)

Use a regular expression with re.sub:

>>> import re
>>> s = '''LaunchPad   Infrastructure
... CyclingLeague   SportsLeague
... Territory   PopulatedPlace
... CurlingLeague   SportsLeague
... GatedCommunity  PopulatedPlace'''
>>> subbed = re.sub('([A-Z][a-z]+)([A-Z])', r'\1_\2', s)
>>> print(subbed)
Launch_Pad   Infrastructure
Cycling_League   Sports_League
Territory   Populated_Place
Curling_League   Sports_League
Gated_Community  Populated_Place

Edit: here is another one, since your test cases are not enough to determine exactly what you want:

>>> re.sub('([a-zA-Z])([A-Z])([a-z])', r'\1_\2\3', 'ABThingThing')
'AB_Thing_Thing'
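
Note that both patterns above consume the capital letter that starts the next word, so a short middle word such as the "Of" in "RouteOfTransportation" appears to get skipped. A minimal sketch of a lookahead variant, which checks for the next capital without consuming it (the function name add_underscores is made up for illustration):

```python
import re

def add_underscores(text):
    # Insert "_" after each capitalized word that is directly followed
    # by another capital letter. The lookahead leaves that capital
    # unconsumed, so it can start the next match.
    return re.sub(r"([A-Z][a-z]+)(?=[A-Z])", r"\1_", text)

# The consuming pattern from above, for comparison.
consuming = re.sub(r"([A-Z][a-z]+)([A-Z])", r"\1_\2", "RouteOfTransportation")

print(consuming)                                 # Route_OfTransportation
print(add_underscores("RouteOfTransportation"))  # Route_Of_Transportation
```

For the two-word phrases in the question, both versions should give the same result.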

Answer 1: (score: 2)

Combine re.findall and str.join:

>>> "_".join(re.findall(r"[A-Z]{1}[^A-Z]*", text))
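
Applied to the whole tab-separated text in one go, this would also put a _ in front of every phrase that follows a tab or newline, because [^A-Z]* swallows the whitespace. A sketch that applies it to each phrase separately, assuming every phrase starts with a capital letter as in the question (split_camel and convert_text are hypothetical helper names):

```python
import re

def split_camel(phrase):
    # Each piece is one capital letter plus everything up to the next capital.
    # Assumption: the phrase starts with a capital; any leading lowercase
    # characters would not be captured by findall and would be lost.
    return "_".join(re.findall(r"[A-Z][^A-Z]*", phrase))

def convert_text(text):
    # Process each line and each tab-separated phrase on its own,
    # then put the tabs and newlines back.
    return "\n".join(
        "\t".join(split_camel(phrase) for phrase in line.split("\t"))
        for line in text.splitlines()
    )

text = "RoadTunnel\tRouteOfTransportation\nTerritory\tPopulatedPlace"
print(convert_text(text))
```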

Answer 2: (score: 2)

Depending on your needs, a slightly different solution could be:

import re
result = re.sub(r"([a-zA-Z])(?=[A-Z])", r"\1_", s)

It inserts _ before any uppercase letter that follows another letter (whether uppercase or lowercase).

  • "TheRabbit IsBlue" => "The_Rabbit Is_Blue"
  • "ABThing ThingAB" => "A_B_Thing Thing_A_B"

It does not support special characters.
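
For reference, a quick check of this pattern against a few of the sample lines from the question (the variable sample is just those lines joined with tabs and newlines):

```python
import re

sample = (
    "RoadTunnel\tRouteOfTransportation\n"
    "LaunchPad\tInfrastructure\n"
    "Territory\tPopulatedPlace"
)

# Tabs and newlines are not letters, so a capital that follows them
# does not get an underscore in front of it.
result = re.sub(r"([a-zA-Z])(?=[A-Z])", r"\1_", sample)

for line in result.splitlines():
    print(line.split("\t"))
# ['Road_Tunnel', 'Route_Of_Transportation']
# ['Launch_Pad', 'Infrastructure']
# ['Territory', 'Populated_Place']
```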