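Remove specific words from the end of a string

Date: 2019-03-25 07:14:08

Tags: python string replace

I am trying to remove specific words from the end of a string, repeating until the string no longer ends with any of them.

I tried the following: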

companylist=['dell inc corp', 'the co dell corp inc', 'the co dell corp inc co']

def rchop(thestring, ending):
    # Remove `ending` once if the string ends with it.
    if thestring.endswith(ending):
        return thestring[:-len(ending)]
    return thestring

for item in companylist:
    item = rchop(item, ' co')
    item = rchop(item, ' corp')
    item = rchop(item, ' inc')
    print(item)

I expect the following results:

dell
the co dell
the co dell

But I get these results instead:

dell
the co dell corp
the co dell corp

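How can I make the result independent of the order of the replacement words, so that every replacement word at the end of the string is stripped until none is left?

4 answers:

Answer 0 (score: 2)

If the last word appears in a list of words to remove, you can strip it like this: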

import re

string = "hello how are you"
words_to_remove = ["are", "you"]

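# Indices of every space in the string.
space_positions = [x.start() for x in re.finditer(' ', string)]
print(space_positions)
# Walk the spaces from right to left; cut the string whenever the remaining tail is a word to remove.
for i in reversed(space_positions):
    if string[i+1:] in words_to_remove:
        string = string[:i]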

print(string)

Which outputs:

[5, 9, 13]
hello how
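The same idea can be sketched without regular expressions by splitting the string into words and popping from the end. This is a minimal sketch, not the answerer's code: the helper name `strip_trailing_words` is hypothetical, and it assumes that collapsing runs of whitespace into single spaces is acceptable.

```python
def strip_trailing_words(text, words_to_remove):
    """Drop words from the end of text while the last word is in words_to_remove."""
    removable = set(words_to_remove)
    words = text.split()  # note: this normalizes whitespace
    while words and words[-1] in removable:
        words.pop()
    return " ".join(words)


companylist = ['dell inc corp', 'the co dell corp inc', 'the co dell corp inc co']
for item in companylist:
    print(strip_trailing_words(item, ['co', 'corp', 'inc']))
# dell
# the co dell
# the co dell
```

Because the loop only inspects the last word, the order of the words in the list does not matter, and a 'co' in the middle of the string is left alone.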

If you just want to remove the last word, whatever it is, this code shows where the string can be cut at each space:

import re

string = "hello how are you?"

space_positions = [x.start() for x in re.finditer(' ', string)]
print(space_positions)
for i in reversed(space_positions):
    print(string[:i], '---', string[i:])

Which outputs:

[5, 9, 13]
hello how are ---  you?
hello how ---  are you?
hello ---  how are you?

string[:i] is everything before the space at index i, and string[i:] is everything from that space onward (including the space itself).
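To actually drop only the last word, the cut at the final space is enough. A minimal sketch using `str.rpartition`; the helper name `drop_last_word` is hypothetical, and returning the string unchanged when it contains no space is an assumption about the desired behavior.

```python
def drop_last_word(text):
    """Return text without its last space-separated word."""
    head, sep, _last = text.rpartition(' ')
    # If there is no space, sep is empty; keep the text as-is (assumed behavior).
    return head if sep else text


print(drop_last_word("hello how are you?"))
# hello how are
```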

Answer 1 (score: 2)

Use a regular expression.

For example:

import re

companylist=['dell inc corp', 'co dell corp inc', 'co dell corp inc co']
for i in companylist:
    print(re.sub(r"\W(corp|inc|co)\b", "", i))

Output:

dell
co dell
co dell
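Note that this pattern is not anchored, so it removes the words wherever they follow a non-word character; for the question's input 'the co dell corp inc' it would also remove the inner ' co'. A variant anchored to the end of the string might look like the sketch below. The names `SUFFIXES`, `pattern` and `strip_suffixes` are made up for illustration.

```python
import re

SUFFIXES = ['corp', 'inc', 'co']

# One or more " <suffix>" groups, followed only by optional whitespace and the end of the string.
pattern = re.compile(
    r"(?:\s+(?:" + "|".join(map(re.escape, SUFFIXES)) + r"))+\s*$"
)


def strip_suffixes(name):
    """Remove every trailing suffix word from name; words elsewhere are untouched."""
    return pattern.sub("", name)


for name in ['dell inc corp', 'the co dell corp inc', 'the co dell corp inc co']:
    print(strip_suffixes(name))
# dell
# the co dell
# the co dell
```

Since the whole run of trailing suffixes is matched in one go, the order of the alternatives does not affect the result.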

Answer 2 (score: 0)

You should chain the replace calls so that each one works on the result of the previous one. Replace ' corp' before ' co': ' co' is a prefix of ' corp', so replacing it first would leave a stray 'rp' behind.

companylist = ['dell inc corp', 'co dell corp inc', 'co dell corp inc co']
for idx, item in enumerate(companylist):
    item = item.replace(' corp', '')
    item = item.replace(' inc', '')
    companylist[idx] = item.replace(' co', '')

Or, thanks to @RoadRunner:

companylist = [item.replace(' corp', '').replace(' inc', '').replace(' co', '') for item in companylist]

In both cases:

print(companylist)

gives:

['dell', 'co dell', 'co dell']
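Keep in mind that `str.replace` removes every occurrence, not only those at the end, so the question's 'the co dell corp inc' would lose its inner ' co' as well. An end-only alternative is to reuse the question's `rchop` and repeat it until nothing changes. This is a rough sketch: `rchop_all` is a hypothetical helper, and it assumes that none of the endings is an empty string.

```python
def rchop(thestring, ending):
    """Remove ending once if thestring ends with it (from the question)."""
    if thestring.endswith(ending):
        return thestring[:-len(ending)]
    return thestring


def rchop_all(thestring, endings):
    """Keep chopping any of the endings until a full pass changes nothing."""
    changed = True
    while changed:
        changed = False
        for ending in endings:  # assumes every ending is non-empty
            chopped = rchop(thestring, ending)
            if chopped != thestring:
                thestring = chopped
                changed = True
    return thestring


companylist = ['dell inc corp', 'the co dell corp inc', 'the co dell corp inc co']
print([rchop_all(item, [' co', ' corp', ' inc']) for item in companylist])
# ['dell', 'the co dell', 'the co dell']
```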

Answer 3 (score: 0)

Another way to do it:

companylist=['dell inc corp', 'co dell corp inc', 'co dell corp inc co']
repList = [' inc',' corp',' corp inc']   # the text to cut at for each company, matched by position

for elem, s in zip(repList, companylist):
    print(s.partition(elem)[0])

Output:

dell
co dell
co dell

Edit

Using a list comprehension:

print([s.partition(elem)[0] for (elem,s) in zip(repList,companylist)])