
I am trying to crawl a web page in Python. The problem is that the re module raises an error. Here is my code:

# Python 2.7

import re
import urllib

htm = urllib.urlopen('http://www.4shared.com/')
wp = htm.read()
begin = []
for start_tag in wp:
    x = re.search('<a', wp)
    begin.append(x.end())

Here is the error message:

Traceback (most recent call last):
  File "E:/Python Essen/Python27/crawling_web_pratice.py", line 23, in <module>
    a = re.search(start_tag, wp)
  File "C:\Python27\lib\re.py", line 146, in search
    return _compile(pattern, flags).search(string)
  File "C:\Python27\lib\re.py", line 251, in _compile
    raise error, v # invalid expression
error: nothing to repeat

How can I avoid this error?
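
A possible explanation, as a sketch rather than a definitive fix: the traceback shows the failing call as re.search(start_tag, wp), and because `for start_tag in wp` walks the page one character at a time, start_tag will sooner or later be a regex metacharacter such as `*`, `+` or `?`. Used alone as a pattern, such a character has nothing to repeat, which matches the error text. Assuming the goal is to collect the end offset of every `<a` in the page, one option is to escape the literal with re.escape and let re.finditer do the looping. The helper name find_tag_ends and the sample string below are made up for illustration, and the page download is left out so the snippet runs without network access.

```python
import re


def find_tag_ends(html, tag_start="<a"):
    """Return the end offset of every occurrence of tag_start in html.

    find_tag_ends is a hypothetical helper, not part of the original script.
    """
    # re.escape makes characters such as * + ? match literally,
    # so an arbitrary string can never be an invalid pattern.
    pattern = re.compile(re.escape(tag_start))
    # finditer yields every non-overlapping match, so no manual loop
    # over the characters of the page is needed.
    return [match.end() for match in pattern.finditer(html)]


# Stand-in for the downloaded page (wp in the question); it contains a '*'
# on purpose, the kind of character that breaks re.search(start_tag, wp).
sample = '<p>2*3</p><a href="x">one</a> <a href="y">two</a>'

ends = find_tag_ends(sample)
print(ends)  # [12, 32]
```

For anything beyond locating raw offsets, an HTML parser is usually more robust than regular expressions, but that goes beyond what the question asks.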

Avoiding errors when trying to crawl the web

0 answers: