I have a string like the one below:
tweet = "thank you guys, for coming my birthday @abcd @defg @hijk , and @abcd don't forget your promises"
How can I change that tweet so that each distinct username is replaced by a numbered placeholder, like this?
tweet = "thank you guys, for coming my birthday USERNAME_TWITTER_1 USERNAME_TWITTER_2 USERNAME_TWITTER_3 , and USERNAME_TWITTER_1 don't forget your promises"
Answer 0 (score: 2)
You can use an id_dispatcher function:
from itertools import count
def id_dispatcher():
    # The returned function hands out the next integer on each call: 1, 2, 3, ...
    return lambda c=count(1): next(c)
Then we can set up a defaultdict from the collections package, so that every username seen for the first time is assigned the next id:
from collections import defaultdict
dc = defaultdict(id_dispatcher())
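To see why this works, here is a minimal sketch of how the dictionary behaves on its own. The keys used here are just sample usernames; an unseen key triggers the factory and stores the next id, while a key seen before returns the id it already has.

```python
from collections import defaultdict
from itertools import count

def id_dispatcher():
    # Zero-argument callable that returns 1, 2, 3, ... on successive calls.
    return lambda c=count(1): next(c)

dc = defaultdict(id_dispatcher())

first = dc["@abcd"]   # unseen key: the factory runs and the result is stored
second = dc["@defg"]  # another unseen key gets the next id
again = dc["@abcd"]   # known key: the stored id is returned, no new id is used

print(first, second, again)  # 1 2 1
```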
Then use a regex replacement (see the link on building a Twitter username regex):
import re
re_user = re.compile(r'(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9]+)')
outp = re_user.sub(lambda x : 'USERNAME_TWITTER_%s'%dc[x.group(0)],tweet)
This produces:
>>> re_user.sub(lambda x : 'USERNAME_TWITTER_%s'%dc[x.group(0)],tweet)
"thank you guys, for coming my birthday USERNAME_TWITTER_1 USERNAME_TWITTER_2 USERNAME_TWITTER_3 , and USERNAME_TWITTER_1 don't forget your promises"
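Note that dc is shared state, so ids keep counting across tweets. If you would rather restart the numbering for every tweet, the pieces can be wrapped in one function, roughly as sketched below. The name anonymize_mentions and the simplified pattern are assumptions made for illustration, not Twitter's official username rule: the pattern only requires that the "@" is not preceded by a word character, "." or "-", which keeps things like e-mail addresses untouched.

```python
import re
from collections import defaultdict
from itertools import count

# Simplified mention pattern (an assumption, not Twitter's official rule):
# "@" not preceded by a word character, "." or "-", followed by word characters.
MENTION_RE = re.compile(r'(?<![\w.-])@(\w+)')

def anonymize_mentions(tweet):
    """Replace each distinct @username with USERNAME_TWITTER_<n> (hypothetical helper)."""
    counter = count(1)
    ids = defaultdict(lambda: next(counter))  # fresh numbering for every call
    return MENTION_RE.sub(lambda m: 'USERNAME_TWITTER_%d' % ids[m.group(1)], tweet)

tweet = "thank you guys, for coming my birthday @abcd @defg @hijk , and @abcd don't forget your promises"
result = anonymize_mentions(tweet)
print(result)
```

Because the mapping lives inside the function, calling it on a second tweet starts again at USERNAME_TWITTER_1.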