"unexpected end of regular expression" error using Python

Asked: 2013-12-09 18:52:14

Tags: python regex web-scraping mysql-python

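I'm trying to scrape stock prices from Yahoo! Finance into a local database, following the tutorial at {{3}}, and I keep getting the error in the title whenever I run this code. Can anyone tell me what's wrong here? Thanks.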

from threading import Thread
import urllib
import re
import MySQLdb

gmap = {}

def th(ur):
    base = "http://finance.yahoo.com/q?s="+ur
    regex = '<span id="yfs_l84_'+ur.lower()+'">(.+?)</span>'
    pattern = re.compile(regex)
    htmltext = urllib.urlopen(base).read()
    results = re.findall(pattern, htmltext)
    try:
        gmap[ur] = results[0]
    except:
        print "Got an error"

symbolslist = open("multithread/stocks.txt").read()
symbolslist = symbolslist.replace(" ","").split(",")

print symbolslist

threadlist = []

for u in symbolslist:
    t = Thread(target=th,args=(u,))
    t.start()
    threadlist.append(t)

for b in threadlist:
    b.join()

This is the exact error I get:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
    self.run()
  File "C:\Python27\lib\threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "multithread/threads.py", line 11, in th
    pattern = re.compile(regex)
  File "C:\Python27\lib\re.py", line 190, in compile
    return _compile(pattern, flags)
  File "C:\Python27\lib\re.py", line 242, in _compile
    raise error, v # invalid expression
error: unexpected end of regular expression

1 answer:

Answer 0 (score: 0)

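Alas, you didn't show us the important part: the output of print symbolslist. Something in that list produces an invalid regular expression when it gets pasted into the <span ... boilerplate.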
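To illustrate how that can happen, here is a minimal sketch. The entry "ab[c" is a made-up value, not something taken from the question, and the helper names build_regex and compiles are hypothetical. Any entry containing an unbalanced regex metacharacter such as [ or ( would behave in a similar way.

```python
import re

def build_regex(symbol):
    # Same string concatenation as in the question's th() function.
    return '<span id="yfs_l84_' + symbol.lower() + '">(.+?)</span>'

def compiles(symbol):
    # Return True if the pattern built from this symbol is a valid regex.
    try:
        re.compile(build_regex(symbol))
        return True
    except re.error:
        return False

print(compiles("GOOG"))  # an ordinary ticker symbol
print(compiles("ab[c"))  # hypothetical bad entry: "[" opens a character set that is never closed
```

The exact wording of the error message depends on the Python version and on which metacharacter is involved, so the sketch only checks whether compilation succeeds.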

You can make the error go away by changing that line to:

    regex = '<span id="yfs_l84_' + re.escape(ur.lower()) + '">(.+?)</span>'
                                   ^^^^^^^^^^          ^

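However, if that works, it is probably only hiding the real problem. The real problem is most likely that you have some kind of garbage in symbolslist.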
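One way to track down such garbage is to check each entry before starting any threads. The following is only a sketch: the VALID_SYMBOL rule is an assumption about what a ticker symbol may contain (it is not Yahoo's definition), clean_symbols is a hypothetical helper, and the sample string stands in for the unknown contents of multithread/stocks.txt.

```python
import re

# Assumption: a plausible symbol uses only letters, digits, ".", "-", "^" or "=".
VALID_SYMBOL = re.compile(r'^[A-Za-z0-9.\-^=]+$')

def clean_symbols(raw_text):
    """Split comma-separated text into plausible symbols and suspicious entries."""
    good, bad = [], []
    for item in raw_text.split(","):
        item = item.strip()  # removes spaces and trailing newlines
        if VALID_SYMBOL.match(item):
            good.append(item)
        else:
            bad.append(item)
    return good, bad

# Made-up sample input, not the real file contents.
good, bad = clean_symbols("AAPL, GOOG,ab[c,\n")
print(good)
print([repr(b) for b in bad])  # repr() makes empty or whitespace-only entries visible
```

Note that strip() also removes a trailing newline, which the replace(" ", "") call in the question leaves in place. Printing the rejected entries with repr() should show what needs fixing in the file itself.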