Question

I'm trying to remove specific words from the end of strings until the strings no longer end with any of those words.

I tried the following:

companylist=['dell inc corp', 'the co dell corp inc', 'the co dell corp inc co']

def rchop(thestring, ending):
  if thestring.endswith(ending):
    return thestring[:-len(ending)]
  return thestring

for item in companylist:
    item = rchop(item,' co')
    item = rchop(item,' corp')
    item = rchop(item,' inc')
    print(item)

I expected the following results:

dell
the co dell
the co dell

But I get these results instead:

dell
the co dell corp
the co dell corp

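How can I make the result independent of the order of the replacement words, so that every replacement word at the end of the string is stripped until none is left?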

Answer 1

If the last word is in a list of words to remove, you can remove it like this:

import re

string = "hello how are you"
words_to_remove = ["are", "you"]

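# Index of every space in the string
space_positions = [x.start() for x in re.finditer(r' ', string)]
print(space_positions)
# Walk the spaces from right to left; cut whenever everything after the space is a removable word
for i in reversed(space_positions):
    if string[i+1:] in words_to_remove:
        string = string[:i]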

print(string)

which outputs:

[5, 9, 13]
hello how
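
The same idea can be sketched without regular expressions by splitting the string into words and popping from the end while the last word is in the removal list. This is only a minimal sketch: `strip_trailing_words` is a hypothetical helper name, and it assumes words are separated by whitespace (note that `split()` also collapses runs of whitespace).

```python
def strip_trailing_words(text, words_to_remove):
    """Drop words from the end of text while the last word is in words_to_remove."""
    words = text.split()
    removable = set(words_to_remove)
    # Stop as soon as the last remaining word is not removable
    while words and words[-1] in removable:
        words.pop()
    return " ".join(words)

companylist = ['dell inc corp', 'the co dell corp inc', 'the co dell corp inc co']
results = [strip_trailing_words(item, ['co', 'corp', 'inc']) for item in companylist]
print(results)  # ['dell', 'the co dell', 'the co dell']
```

Because only the last word is ever tested, the order of the words in the removal list does not matter.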

If you only want to remove the last word, whatever it is, this code shows every possible split point:

import re

string = "hello how are you?"

space_positions = [x.start() for x in re.finditer(r' ', string)]
print(space_positions)
for i in reversed(space_positions):
    print(string[:i], '---', string[i:])

which outputs:

[5, 9, 13]
hello how are ---  you?
hello how ---  are you?
hello ---  how are you?

string[:i] is everything before the space at index i, while string[i:] is that space plus everything after it.
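
If only the final split is needed, `str.rpartition` gives it directly without collecting the space positions first. A small sketch, assuming words are separated by single spaces:

```python
string = "hello how are you?"

# rpartition splits at the LAST occurrence of the separator
head, _, last_word = string.rpartition(" ")

print(repr(head))       # 'hello how are'
print(repr(last_word))  # 'you?'
```

If the string contains no space at all, `rpartition` returns an empty `head` and puts the whole string in `last_word`, so that case may need separate handling.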

Answer 2

Use a regular expression.

For example:

import re

companylist=['dell inc corp', 'co dell corp inc', 'co dell corp inc co']
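for i in companylist:
    # Removes every "corp", "inc" or "co" that follows a non-word character,
    # anywhere in the string, not only at the end
    print(re.sub(r"\W(corp|inc|co)\b", "", i))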

Output:

dell
co dell
co dell
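
Note that the pattern above is not anchored, so with the list from the question it would also remove the " co" in the middle of 'the co dell ...'. A possible variant, offered as a sketch rather than a definitive fix, anchors a repeated group to the end of the string with `$` (the name `TRAILING` is just illustrative):

```python
import re

# One or more " corp" / " inc" / " co" groups, but only directly before the end of the string
TRAILING = re.compile(r"(?:\s+(?:corp|inc|co))+$")

companylist = ['dell inc corp', 'the co dell corp inc', 'the co dell corp inc co']
cleaned = [TRAILING.sub("", name) for name in companylist]
print(cleaned)  # ['dell', 'the co dell', 'the co dell']
```

The `+` on the outer group lets a single substitution consume the whole run of trailing words, whatever their order.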

Answer 3

You could use:

companylist = ['dell inc corp', 'co dell corp inc', 'co dell corp inc co']
for idx, item in enumerate(companylist):
    # Chain the calls so each replace works on the result of the previous one.
    # ' corp' has to go before ' co'; otherwise ' co' would turn ' corp' into 'rp'.
    companylist[idx] = item.replace(' corp', '').replace(' inc', '').replace(' co', '')

Or, thanks to @RoadRunner:

companylist = [item.replace(' corp', '').replace(' inc', '').replace(' co', '') for item in companylist]

Now, in both cases:

print(companylist)

gives:

['dell', 'co dell', 'co dell']
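
Keep in mind that `str.replace` removes every occurrence, not only the ones at the end, which is why the list here starts with 'co' rather than 'the co'. To strip only from the end, the `rchop` idea from the question can be repeated until no ending matches any more. A minimal sketch, where `rchop_all` is a hypothetical name:

```python
def rchop_all(text, endings):
    """Repeatedly strip any of the given endings until none of them matches."""
    changed = True
    while changed:
        changed = False
        for ending in endings:
            # Skip empty endings; they would match forever
            if ending and text.endswith(ending):
                text = text[:-len(ending)]
                changed = True
    return text

companylist = ['dell inc corp', 'the co dell corp inc', 'the co dell corp inc co']
endings = (' co', ' corp', ' inc')
companylist = [rchop_all(item, endings) for item in companylist]
print(companylist)  # ['dell', 'the co dell', 'the co dell']
```

The outer loop only stops after a full pass in which nothing was removed, so the result does not depend on the order of `endings`.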

Answer 4

Another way to do it:

companylist = ['dell inc corp', 'co dell corp inc', 'co dell corp inc co']
repList = [' inc', ' corp', ' corp inc']   # substring at which to cut each string, paired by position

for elem, s in zip(repList, companylist):
    # partition() splits at the first occurrence of elem; [0] is the part before it
    print(s.partition(elem)[0])

Output:

dell
co dell
co dell

Edit:

Using a list comprehension:

print([s.partition(elem)[0] for (elem, s) in zip(repList, companylist)])

Output:

['dell', 'co dell', 'co dell']

Remove specific words from the end of a string
