Question

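I'm trying to pick out the users mentioned in a piece of text, that is, the words that start with an @ symbol, and then wrap each of them in < and >.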

What I tried:

def getUsers(content):
    users = []
    l = content.split(' ')
    for user in l:
        if user.startswith('@'):
            users.append(user)
    return users

old_string = "Getting and replacing mentions of users. @me @mentee @you @your @us @usa @wo @world @word @wonderland"

users = getUsers(old_string)

new_array = old_string.split(' ')

for mention in new_array:
    for user in users:
        if mention == user and len(mention) == len(user):
            old_string = old_string.replace(mention, '<' + user + '>')

print(old_string)
print(users)

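The code behaves oddly. It also matches words that merely start with the same letters, and cuts them off partway through, as the output below shows:

Result: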

Getting and replacing mentions of users. <@me> <@me>ntee <@you> <@you>r <@us> <@us>a <@wo> <@wo>rld <@wo>rd <@wo>nderland
['@me', '@mentee', '@you', '@your', '@us', '@usa', '@wo', '@world', '@word', '@wonderland']

Expected result:

Getting and replacing mentions of users. <@me> <@mentee> <@you> <@your> <@us> <@usa> <@wo> <@world> <@word> <@wonderland>
['@me', '@mentee', '@you', '@your', '@us', '@usa', '@wo', '@world', '@word', '@wonderland']

Process finished with exit code 0

Why does this happen? How can I do this the right way?

Answer 1

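Why this happens: when you split the string, you do plenty of checking to make sure you're looking at the right user. For example, you have @me and @mentee, so for the user @me the comparison matches the first token and not the second.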

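However, when you do the replace, you are replacing across the whole string. So when you ask to replace, say, @me with <@me>, it knows nothing about your careful split; it just looks for @me in the string and replaces it. Since @mentee also contains @me, that part gets replaced too.

Two (okay, three) options. One is to put spaces around the name so the match is delimited (as @parchment wrote).

The second is to make use of your split: replace the entry in the split list instead of replacing in the original string. The simplest way is to use enumerate:

new_array = old_string.split(' ')
for index, mention in enumerate(new_array):
    for user in users:
        if mention == user and len(mention) == len(user):
            # We won't replace this in old_string, we'll replace the current entry
            # old_string = old_string.replace(mention, '<' + user + '>')
            new_array[index] = '<%s>' % user
new_string = ' '.join(new_array)

The third way is a bit more involved, but what you really want is for every instance of '@anything' to be replaced with <@anything> (perhaps delimited by whitespace?). You can do that in one pass with re.sub:

import re
new_string = re.sub(r'(@\w+)', r'<\g<0>>', old_string)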
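As a quick check of the explanation above, here is a minimal Python 3 sketch that contrasts a whole-string replace with a per-token rewrite. The names `text`, `broken`, and `fixed` are made up for illustration and do not come from the question.

```python
text = "@me @mentee"

# str.replace substitutes every occurrence of the substring,
# including the "@me" at the start of "@mentee".
broken = text.replace("@me", "<@me>")

# Rewriting token by token only ever touches whole words.
tokens = text.split(' ')
fixed = ' '.join('<' + t + '>' if t.startswith('@') else t for t in tokens)

print(broken)
print(fixed)
```

The second form never searches inside a token, which is why the prefix collision cannot occur there.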

Answer 2

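My previous answer was based entirely on correcting the problems in your current code. There is, however, a better way to do this: use a regular expression.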

import re

old_string = re.sub(r'(@\w+)\b', r'<\1>', old_string)

For details, see the documentation for the re module.
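To see how this pattern behaves at the edges, the Python 3 sketch below runs it on a sample string containing punctuation and an email address. The sample text and the names `loose` and `strict` are illustrative assumptions. Because \w+ is greedy, the trailing \b should not change what gets matched. The second pattern, which uses a (?<!\S) lookbehind so that a mention must start a whitespace-delimited token, is one possible tightening and is not part of the answer itself.

```python
import re

sample = "mail a@b.com and @me, hi"

# Pattern from the answer: wraps any "@word", even in the middle of a token.
loose = re.sub(r'(@\w+)\b', r'<\1>', sample)

# Assumed variant: only match when "@" is at the start of the string
# or is preceded by whitespace.
strict = re.sub(r'(?<!\S)(@\w+)', r'<\1>', sample)

print(loose)
print(strict)
```

Which variant is appropriate depends on whether an "@" in the middle of a token, such as in an email address, should count as a mention.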

Answer 3

Because @me comes first in your array, your code replaces the @me inside @mentee.

The simplest way to fix this is to add a space after the username being replaced:

old_string = old_string.replace(mention + ' ', '<' + user + '> ')
                      # I added space here ^         and here ^

But this introduces a new problem: the last word does not get wrapped, because there is no space after it. A very simple way around that is:

old_string = old_string + ' '   # pad so the last mention is followed by a space

for mention in ... # Your loop

old_string = old_string[:-1]    # drop the padding again
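Put together, the trailing-space approach might look like the following Python 3 sketch. The function name `wrap_with_trailing_space` is hypothetical, and the loop is simplified to iterate over the collected users directly rather than reproducing the nested loop from the question.

```python
def wrap_with_trailing_space(text):
    # Collect the mentions the same way the question does.
    users = [word for word in text.split(' ') if word.startswith('@')]
    padded = text + ' '  # make sure the last mention is followed by a space
    for user in users:
        # Matching "name + space" stops "@me" from hitting "@mentee".
        padded = padded.replace(user + ' ', '<' + user + '> ')
    return padded[:-1]  # remove the padding


old_string = ("Getting and replacing mentions of users. @me @mentee @you "
              "@your @us @usa @wo @world @word @wonderland")
print(wrap_with_trailing_space(old_string))
```

One limitation of this sketch: it is still a substring search, so a token that merely ends with a mention (for example "x@me" when "@me" is also mentioned) would be rewritten as well.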

Answer 4

This works as long as there is no punctuation (such as a comma) right next to the usernames.

def wrapUsers(content):
    L = content.split()
    newL = []
    for word in L:
        if word.startswith('@'): word = '<'+word+'>'
        newL.append(word)
    return " ".join(newL)

Python string replace behaves strangely
