In the example below, I can't get any data out of the page using the re library.
What am I doing wrong?
#!/usr/bin/env python
# -*- coding:utf-8 -*-
import urllib
import re

def getData():
    res = urllib.urlopen("http://www.quanshuwang.com/book/0/149/34333.html").read()
    html = res.decode("gbk").encode("utf-8")
    reg = r'style5\(\);</script>(.*?)<script type="text/javascript">style6'
    print re.findall(reg, html)

getData()
Answer 0 (score: 0)
You have:
reg = r'style5\(\);</script>(.*?)<script type="text/javascript">style6'
I think your problem is with the syntax:
reg = 'style5\(\);</script>(.*?)<script type="text/javascript">style6'
You need to remove the extra "r".
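
One thing worth checking before relying on that change: `\(` is not a recognized escape sequence in an ordinary Python string literal, so the backslash is kept and both versions of `reg` should produce the same pattern. A more likely cause, assuming the text between the two script tags spans several lines, is that `.` does not match newlines unless `re.S` (`re.DOTALL`) is passed. The sketch below illustrates this; `sample_html` and `extract` are hypothetical stand-ins for illustration, not the real page content or the asker's code.

```python
import re

# Same pattern as in the question; the r prefix keeps the backslashes literal.
PATTERN = r'style5\(\);</script>(.*?)<script type="text/javascript">style6'

# Hypothetical stand-in for the downloaded page: the text between the two
# script tags is spread over several lines.
sample_html = (
    '<script type="text/javascript">style5();</script>\n'
    'first line<br />\n'
    'second line\n'
    '<script type="text/javascript">style6();</script>'
)


def extract(html, flags=0):
    """Return every captured group that PATTERN finds in html."""
    return re.findall(PATTERN, html, flags)


# Without re.S, "." stops at the first newline, so nothing matches.
without_dotall = extract(sample_html)

# With re.S, "." also matches newlines, so the multi-line text is captured.
with_dotall = extract(sample_html, re.S)

print(without_dotall)
print(with_dotall)
```

If the real page does put the chapter text on multiple lines, calling `re.findall(reg, html, re.S)` in `getData()` would be the corresponding change.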