Question

I have a string like this:

<p>Millions of people watch TV.</p><br/>https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0<br/><p>Good boy!</p><br/>

I want to remove this part:

https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0

so that only the following remains:

<p>Millions of people watch TV.</p><br/><br/><p>Good boy!</p><br/>

My code:

mystring = '<p>Millions of people watch TV.</p><br/>https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0<br/><p>Good boy!</p><br/>'

How can I do this?

Answer 1

You can use re.sub from the regular expression module:

import re
mystring = '<p>Millions of people watch TV.</p><br/>https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0<br/><p>Good boy!</p><br/>'
print(re.sub(r'http[^<]+', '', mystring))

Output:

<p>Millions of people watch TV.</p><br/><br/><p>Good boy!</p><br/>
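
The pattern above matches anything that starts with "http" and runs up to the next "<". A slightly stricter variant, sketched here, also requires the "://" separator so that ordinary words beginning with "http" are left alone. The names URL_PATTERN and strip_urls are hypothetical and not part of the original answer.

```python
import re

# Hypothetical helper (not from the original answer): match an http/https URL
# and stop at the next '<' or at any whitespace.
URL_PATTERN = re.compile(r'https?://[^<\s]+')

def strip_urls(html: str) -> str:
    """Return html with bare URLs removed; the surrounding tags are untouched."""
    return URL_PATTERN.sub('', html)

mystring = '<p>Millions of people watch TV.</p><br/>https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0<br/><p>Good boy!</p><br/>'
print(strip_urls(mystring))
```

Compiling the pattern once is only a convenience if the function is called on many strings; for a single call, re.sub with the same pattern behaves the same way.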

Answer 2

You can use a regular expression find-and-replace:

Find: <br/>https?://[^<]*<br/>

Replace: <br/><br/>
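
In Python, this find-and-replace could be written with re.sub, roughly as follows. This is a sketch that assumes both surrounding tags are written as <br/>, as they are in the question's string.

```python
import re

mystring = '<p>Millions of people watch TV.</p><br/>https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0<br/><p>Good boy!</p><br/>'

# Match a URL only when it sits directly between two <br/> tags,
# then put the two tags back without the URL.
result = re.sub(r'<br/>https?://[^<]*<br/>', '<br/><br/>', mystring)
print(result)
```

Because the tags are part of the match, URLs that appear elsewhere (for example inside a paragraph) are not affected.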

Answer 3

mystring = '<p>Millions of people watch TV.</p><br/>https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0<br/><p>Good boy!</p><br/>'
# remove 'https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0'
resultstring = '<p>Millions of people watch TV.</p><br/><br/><p>Good boy!</p><br/>'

length = len(mystring)
startPos = -1
endPos = -1
for i in range(length):
    subString = mystring[i:]
    if subString.startswith('<br/>'):
        if(startPos == -1):
            startPos = i
            continue # check from next character to get endPos

        if(endPos == -1):
            endPos = i


firstSubString = mystring[:startPos + 5] # 5 = the character length of '<br/>'
lastSubString = mystring[endPos:]


completeResult = firstSubString + lastSubString
print(completeResult, completeResult == resultstring)
print(completeResult, resultstring)
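
The same index-based idea can be sketched more compactly with str.find, which returns -1 when the marker is absent. The function name remove_between is hypothetical, and the sketch assumes the text to drop lies between the first two occurrences of the marker.

```python
def remove_between(text: str, marker: str = '<br/>') -> str:
    """Drop whatever lies between the first two occurrences of marker."""
    first = text.find(marker)
    if first == -1:
        return text  # no marker at all: nothing to remove
    start = first + len(marker)        # position just after the first marker
    second = text.find(marker, start)  # search for the next marker from there
    if second == -1:
        return text  # only one marker: nothing to remove
    return text[:start] + text[second:]

mystring = '<p>Millions of people watch TV.</p><br/>https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0<br/><p>Good boy!</p><br/>'
print(remove_between(mystring))
```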

Answer 4

import re

mystring = '<p>Millions of people watch TV.</p><br/>https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0<br/><p>Good boy!</p><br/>'
# Raw string so that \s and \S reach the regex engine unchanged.
print(re.sub(r"(?:<br/>https)([\s\S]*?)(?=<br/>)", '<br/>', mystring))

Output:

<p>Millions of people watch TV.</p><br/><br/><p>Good boy!</p><br/>
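
A variant of this approach, sketched below, uses a lookbehind as well as a lookahead, so neither <br/> tag is consumed and the replacement can be an empty string. It assumes the URL starts immediately after the first <br/>, as in the question's string.

```python
import re

mystring = '<p>Millions of people watch TV.</p><br/>https://sites.google.com/aaa-net.bb.cc/be-do-have/%E3%83%9B%E3%83%BC%E3%83%A0<br/><p>Good boy!</p><br/>'

# (?<=<br/>) requires a preceding <br/> and (?=<br/>) a following one;
# neither is part of the match. re.DOTALL lets '.' cross line breaks,
# similar to [\s\S] in the pattern above.
pattern = re.compile(r'(?<=<br/>)https?://.*?(?=<br/>)', re.DOTALL)
result = pattern.sub('', mystring)
print(result)
```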

Python: replace the URL content between <br/> and <br/>
