Remove hyperlinks from a string without removing the delimiter

Time: 2018-04-02 08:47:24

Tags: python regex pandas dataframe

I have a list of strings containing hyperlinks that need to be removed. Each string is delimited by ,' at the end.

"This is a string with a link in it that needs to go. http://www.theonion.com,"

I tried using re.sub to do this:

CleanedData = re.sub(r"http\S+", "", str(datachunk))

But when the string ends with a hyperlink, the function also removes the trailing delimiter (the "), which messes things up:

"This is a string with a link in it that needs to go.

Is there a way to tell the interpreter to leave the delimiter alone?

1 answer:

Answer 0: (score: 0)

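This should help. Use \b (a word boundary) so that the match ends at the last word character of the link and the trailing punctuation is left alone.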

import re
s = "This is a string with a link in it that needs to go. http://www.theonion.com,"
print(re.sub(r"\bhttp\S+\b", "", s))

Output:

This is a string with a link in it that needs to go. ,
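One caveat with the \b approach: if the link ends in a non-word character such as a trailing slash, the match stops at the last word character and the slash is left behind. A possible alternative, sketched below, is to exclude the delimiter characters from the match instead. The names URL_PATTERN, strip_links and rows are hypothetical, the second sample string is made up for illustration, and the sketch assumes the delimiters to preserve are commas and quote characters (so a URL that legitimately contains a comma would be cut short).

```python
import re

# Assumption: commas and quote characters are delimiters and never part of a URL.
URL_PATTERN = re.compile(r"https?://[^\s,'\"]+")

def strip_links(text):
    """Remove URLs from text, leaving any trailing delimiter in place."""
    return URL_PATTERN.sub("", text)

# Hypothetical sample data; the second row has a URL ending in a slash.
rows = [
    "This is a string with a link in it that needs to go. http://www.theonion.com,",
    "Trailing slash case: http://www.theonion.com/,",
]

# Clean each string separately instead of calling str() on the whole list.
cleaned = [strip_links(row) for row in rows]
for row in cleaned:
    print(row)
```

Cleaning each element on its own also avoids running the regex over the string form of the whole list, where the quotes and commas between elements sit right next to the links.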