I'm trying to adapt the code I saw in this answer:
import re
pat1 = re.compile(r"(^|[\n ])(([\w]+?://[\w\#$%&~.\-;:=,?@\[\]+]*)(/[\w\#$%&~/.\-;:=,?@\[\]+]*)?)", re.IGNORECASE | re.DOTALL)
pat2 = re.compile(r"#(^|[\n ])(((www|ftp)\.[\w\#$%&~.\-;:=,?@\[\]+]*)(/[\w\#$%&~/.\-;:=,?@\[\]+]*)?)", re.IGNORECASE | re.DOTALL)
urlstr = 'http://www.example.com/foo/bar.html'
urlstr = pat1.sub(r'\1<a href="\2" target="_blank">\3</a>', urlstr)
urlstr = pat2.sub(r'\1<a href="http:/\2" target="_blank">\3</a>', urlstr)
print urlstr
Specifically, I tried this:
pattern = re.compile('<a href="javascript:rt\(([0-9]+)\)">Download</a>');
rawtable = pattern.sub(r'\1', rawtable)
I want to replace something like this:
<a href="javascript:rt(2061)">Download</a>
with this:
2061
I want to do the same thing with this:
<a href="#" onclick="javascript:ra('Name of object one')"
title="Some title Text">Name of Object two</a>
leaving just
Name of Object two
by doing
pattern = re.compile('<a href="#" onclick="javascript:ra\('(:?[a-zA-Z0-9 +)'\)" title="Some title Text">([a-zA-Z0-9 ]+)</a>');
rawtable = pattern.sub(r'\1', rawtable)
But that doesn't work either. Any tips?
Answer 0 (score: 2)
I want to replace something like this:
<a href="javascript:rt(2061)">Download</a>
Your first snippet works as written.
I want to do the same thing with this:
<a href="#" onclick="javascript:ra('Name of object one')" title="Some title Text">Name of Object two</a>
As for the second one, look at the problems in this line:
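As a quick check of that claim, here is a minimal sketch that runs the question's first pattern against the question's sample input. The only change is the r prefix on the pattern string, which is an addition for illustration so that the backslashes reach the regex engine unchanged.

```python
import re

# Sample input copied from the question
rawtable = '<a href="javascript:rt(2061)">Download</a>'

# Raw string (an addition here): \( and \) are passed to the regex engine as-is
pattern = re.compile(r'<a href="javascript:rt\(([0-9]+)\)">Download</a>')

# \1 refers to the digits captured inside rt(...)
result = pattern.sub(r'\1', rawtable)
print(result)  # prints 2061
```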
pattern = re.compile('<a href="#" onclick="javascript:ra\('(:?[a-zA-Z0-9 +)'\)" title="Some title Text">([a-zA-Z0-9 ]+)</a>');
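Reading the pattern from left to right, there are four problems:
1. The ' right after ra\( is an unescaped quote; it ends the string passed to re.compile().
2. (:? was presumably meant to be a non-capturing group; the correct syntax is (?: pattern ). However, there is no point in using one here.
3. The character class [a-zA-Z0-9 + is never closed (as in [a-z]); add a "]".
4. The ' just before \) is another unescaped quote.
A corrected version: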
# Python 3.4.3
import re
rawtable = '<a href="#" onclick="javascript:ra(\'Name of object one\')" title="Some title Text">Name of Object two</a>'
# Raw string: \( and \) are regex escapes; \' keeps the quote from ending the string and matches a literal '
pattern = re.compile(r'<a href="#" onclick="javascript:ra\(\'[a-zA-Z0-9 ]+\'\)" title="Some title Text">([a-zA-Z0-9 ]+)</a>')
# Replace the whole link with its captured link text
rawtable = pattern.sub(r'\1', rawtable)
print(rawtable)
Name of Object two
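The pattern above hard-codes the title text and expects exactly one space between attributes, so it would not match if the tag is wrapped across two lines as it is displayed in the question. A looser variant, sketched here under the assumption that no attribute value contains a ">" character, keeps only the link text of any anchor. The helper name strip_links is hypothetical, and for arbitrary HTML a real parser is usually a safer choice than a regex.

```python
import re

def strip_links(html):
    """Replace each <a ...>text</a> with just its text (hypothetical helper)."""
    # [^>]* skips over all attributes, including newlines between them;
    # (.*?) lazily captures the link text up to the nearest </a>
    return re.sub(r'<a\b[^>]*>(.*?)</a>', r'\1', html,
                  flags=re.IGNORECASE | re.DOTALL)

# The tag from the question, with the line break between attributes kept
sample = ('<a href="#" onclick="javascript:ra(\'Name of object one\')"\n'
          'title="Some title Text">Name of Object two</a>')

print(strip_links(sample))  # prints Name of Object two
```

Note that this only covers the second case: applied to the first link it would return "Download", not the number inside rt(...), so the original first pattern is still needed there.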