I have a list of prefixes:
prefixes = [u'path', u'folder', u'directory', u'd']
and some strings such as
s1 = u'path common path and directory'
s2 = u'directory common path and directory'
s3 = u'directory folder distinct and directory folder'
s4 = u'distinct and directory folder'
s5 = u'd fixable directory or folder'
I only want to remove a prefix that matches one of the entries in the list, and only at the start of the string:
# after processing
s1 = u'common path and directory'
s2 = u'common path and directory'
s3 = u'folder distinct and directory folder'
s4 = u'distinct and directory folder'
s5 = u'fixable directory or folder'
I tried
''.join([word for word in s1.split() if word not in prefixes])
or
for prefix in prefixes:
    if s1.startswith(prefix):
        return s1[len(prefix):]
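But this either removes the prefix anywhere in the string, not only at the start, or fails to match whole words (note the d prefix in the list, which matches directory and leaves irectory). Is there a way to do this without regular expressions?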
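Answer 0 (score: 1)
If you only want to match whole words, each word will be terminated by a space character. I suggest appending a space to the prefix: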
prefixes = [u'path', u'folder', u'directory', u'd']
mystrings = [u'path common path and directory',
             u'directory common path and directory',
             u'directory folder distinct and directory folder',
             u'distinct and directory folder',
             u'd fixable directory or folder']
for s in mystrings:
    for prefix in prefixes:
        # The trailing space forces a whole-word match; unmatched strings are not printed
        if s.startswith(prefix + " "):
            print(s[len(prefix) + 1:])
>>>
common path and directory
common path and directory
folder distinct and directory folder
fixable directory or folder
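The same idea can be wrapped in a function that returns the result instead of printing it, so that a string with no matching prefix (such as s4) is passed through unchanged. This is only a minimal sketch and not part of the original answer; the name strip_prefix is made up for illustration.

```python
prefixes = [u'path', u'folder', u'directory', u'd']

def strip_prefix(s, prefixes):
    # Hypothetical helper: remove one leading prefix, matched as a whole word.
    for prefix in prefixes:
        # The trailing space keeps u'd' from matching u'directory ...'.
        if s.startswith(prefix + u' '):
            return s[len(prefix) + 1:]
    # No prefix matched: return the string unchanged.
    return s

print(strip_prefix(u'd fixable directory or folder', prefixes))
print(strip_prefix(u'distinct and directory folder', prefixes))
```

Note that a string consisting of a prefix alone (for example u'path') is left untouched here, because no space follows it.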
Answer 1 (score: 0)
I would split on " " to get the first word and remove it if it is in the prefix list.
firstWord = s1.split(" ")[0]
if firstWord in prefixes:
    s1 = " ".join(s1.split(" ")[1:])
You can also use split() with no arguments, which splits on any run of whitespace.
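The difference between the two calls only shows up when the input has irregular spacing. A small illustration, using a made-up string that is not from the question:

```python
s = u'path  common\tpath'  # two spaces, then a tab

# split(" ") cuts at every single space, so consecutive spaces yield an empty string
by_space = s.split(" ")

# split() with no argument cuts at any run of whitespace and drops empty strings
by_whitespace = s.split()

print(by_space)
print(by_whitespace)
```

Keep in mind that rejoining the words with " ".join(...) after split() collapses the original spacing in the rest of the string.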
Answer 2 (score: 0)
I suggest partition, or split with a limit, for this.
prefixes = [u'path', u'folder', u'directory', u'd']
strings = [u'path common path and directory',
           u'directory common path and directory',
           u'directory folder distinct and directory folder',
           u'distinct and directory folder',
           u'd fixable directory or folder']
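partition returns a 3-item tuple containing head, sep and tail. The head is everything before the separator, sep is the separator the string was split on, and the tail is everything after it. Indexing with [2] grabs only the tail.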
res = []
for s in strings:
    s2 = s.partition(' ')
    if s2[0] in prefixes:
        res.append(s2[2])
    else:
        res.append(s)
print(res)
#List comp
print([s.partition(' ')[2] if s.partition(' ')[0] in prefixes else s for s in strings])
#Output for s1 | (head, sep, tail)
[0] | "path"
[1] | " "
[2] | "common path and directory"
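To make the shape of the tuple concrete, here is a short sketch of what partition returns both when the separator is present and when it is missing. The one-word input is an extra case that does not appear in the question.

```python
# Separator present: (head, sep, tail)
with_sep = u'path common path'.partition(u' ')

# Separator absent: the whole string is the head; sep and tail are empty strings
without_sep = u'path'.partition(u' ')

print(with_sep)
print(without_sep)
```

With the partition loop above, a one-word string that is itself a prefix therefore becomes an empty string rather than raising an error.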
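split with a limit produces a list in which the string is split at the separator only the specified number of times, with everything remaining added as the last item. So with a limit of 1 the list has at most 2 items.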
res = []
for s in strings:
    s2 = s.split(' ', 1)
    if s2[0] in prefixes:
        res.append(s2[1])
    else:
        res.append(s)
print(res)
#List comp
print([s.split(' ', 1)[1] if s.split(' ', 1)[0] in prefixes else s for s in strings])
#Output for s1 | [first item, everything else]
[0] | "path"
[1] | "common path and directory"
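One thing to watch with the split version: because the list has at most 2 items, it may also have just 1, and indexing [1] on a one-word string that is itself a prefix raises IndexError. A possible guard is sketched below; the helper name strip_first_word is hypothetical and the one-word input is not among the question's examples.

```python
prefixes = [u'path', u'folder', u'directory', u'd']

def strip_first_word(s, prefixes):
    # Hypothetical helper built on split with a limit of 1.
    parts = s.split(u' ', 1)  # at most two items
    if parts[0] in prefixes:
        # A one-word string gives a one-item list, so guard the index.
        return parts[1] if len(parts) > 1 else u''
    # First word is not a prefix: return the string unchanged.
    return s

print(strip_first_word(u'd fixable directory or folder', prefixes))
print(strip_first_word(u'path', prefixes))
```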
Answer 3 (score: 0)
You can use this function
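def func(s):
    words = s.split()
    # Drop the first word only if it is one of the prefixes
    if words and words[0] in prefixes:
        return ' '.join(words[1:])
    # Otherwise return the string unchanged
    return s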
This takes the first word and checks whether it is in prefixes. If it is, that word is removed.