Try/except runs the except clause when it shouldn't

Date: 2015-06-07 11:52:11

Tags: python python-2.7

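I cobbled together a web scraper that looks for alternative XPaths for text content on a web page. In the example below, it should look for two alternative XPaths and, depending on which one is available, store the result in the variable issuer.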

try:
    xpath_issuer = ".//*[@id='dv_PRE88f496c28ad6488895f1ffc383fae8bd_list_list']/div/div[3]/table/tbody/tr[2]/td[2]"
    find_issuer = driver.find_element_by_xpath(xpath_issuer)
    issuer = re.search(r"(.+)", find_issuer.text).group()
except NoSuchElementException:
    pass
try:
    xpath_issuer = ".//*[@id='dv_PRE00e883469a264528b20fbbc31b0da4a2_list_list']/div/div[3]/table/tbody/tr[1]/td[2]/a"
    find_issuer = driver.find_element_by_xpath(xpath_issuer)
    issuer = re.search(r"(.+)", find_issuer.text).group()
except NoSuchElementException:
    pass

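In reality I will be looking for many XPaths, not just these two. I tried defining class Work(): as a way to shorten the expression, so that I don't have to repeat everything.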

def crawl(x):
#Looks for variable Name here, omitted
    list_xpath_issuer = [".//*[@id='dv_PRE88f496c28ad6488895f1ffc383fae8bd_list_list']/div/div[3]/table/tbody/tr[2]/td[2]", ".//*[@id='dv_PRE00e883469a264528b20fbbc31b0da4a2_list_list']/div/div[3]/table/tbody/tr[1]/td[2]/a"]
    class Work():
        def __init__(self):
            y = self.getIssuer()
            print(y)

        def getIssuer(self):
            for i in range(len(list_xpath_issuer)):
                xpath_issuer = list_xpath_issuer[i]
                try:
                    find_issuer = driver.find_element_by_xpath(xpath_issuer)
                    issuer = re.search(r"(.+)", find_issuer.text).group().encode("utf-8")
                    print "Issuer: %s" % issuer
                    return "Xpath is %s" % xpath_issuer
                except NoSuchElementException:
                    print "This is an exception"
                    pass
    Work()
    return pd.Series([isin, instrument_name, issuer])
df[["Name", "Issuer"]] = df["ISIN"].apply(crawl)

The problem is that, for some reason, issuer ends up empty, with this error:

  

NameError: global name 'issuer' is not defined

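It does find one of the XPaths, since the value is assigned to issuer in the try stage, but for some reason it also runs the except stage, which seems to wipe out the value in issuer. Any ideas?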

Edit: traceback

This is an exception
Issuer: Boost Issuer Plc
Xpath is .//*[@id='dv_PRE00e883469a264528b20fbbc31b0da4a2_list_list']/div/div[3]/table/tbody/tr[1]/td[2]/a
Traceback (most recent call last):
  File "xetra_lookup16.py", line 211, in <module>
    df[["Name", "Symbol", "Issuer"]] = df["ISIN"].apply(crawl)
  File "/lib/python2.7/site-packages/pandas/core/series.py", line 2053, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1064, in pandas.lib.map_infer (pandas/lib.c:58519)
  File "xetra_lookup16.py", line 208, in crawl
    return pd.Series([isin, instrument_name, issuer])
NameError: global name 'issuer' is not defined

1 answer:

Answer 0 (score: 1)

The except suite is working just fine

The exception handling is not what is failing here. You are trying to use the name issuer in the crawl function, but it is never set there.

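issuer in crawl is a local variable of crawl, and it is not the same variable as the issuer used in the getIssuer method. The two are entirely unrelated. Setting one does not make the other appear.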
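The scoping point above can be reproduced without Selenium or pandas. The following is only a minimal sketch: the names crawl_broken, crawl_fixed and get_issuer are hypothetical stand-ins for the code in the question, and the issuer string is copied from the traceback output.

```python
class Work(object):
    def get_issuer(self):
        # 'issuer' is local to get_issuer and disappears when the method returns.
        issuer = "Boost Issuer Plc"
        return issuer


def crawl_broken():
    Work().get_issuer()  # the return value is thrown away
    return issuer        # raises NameError: no 'issuer' exists in this scope


def crawl_fixed():
    # Bind the returned value to a name in the caller's own scope.
    issuer = Work().get_issuer()
    return issuer
```

Calling crawl_broken() raises NameError even though get_issuer ran without error, which matches the traceback in the question; crawl_fixed() returns the value because the caller assigns it.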

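When you create an instance of Work, it may print the result, but nothing is actually passed back to the caller. I'm not sure why you are using a class here; you could just as well inline the method: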

Work.getIssuer()
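One possible way to do the lookup without the class is sketched below. It is not the original answer's code: find_issuer, FakeDriver and FakeElement are hypothetical names introduced for illustration, and the fallback exception class exists only so the sketch runs where Selenium is not installed. The find_element_by_xpath call mirrors the one used in the question.

```python
import re

try:
    from selenium.common.exceptions import NoSuchElementException
except ImportError:
    # Assumption: stand-in so the sketch runs without Selenium installed.
    class NoSuchElementException(Exception):
        pass


def find_issuer(driver, xpaths):
    """Return the text of the first XPath that matches, or None if none do."""
    for xpath in xpaths:
        try:
            element = driver.find_element_by_xpath(xpath)
        except NoSuchElementException:
            continue  # this candidate is absent, try the next one
        match = re.search(r"(.+)", element.text)
        if match:
            return match.group()
    return None


class FakeElement(object):
    """Hypothetical stand-in for a Selenium element."""
    def __init__(self, text):
        self.text = text


class FakeDriver(object):
    """Hypothetical stand-in for a Selenium driver, for demonstration only."""
    def __init__(self, pages):
        self.pages = pages

    def find_element_by_xpath(self, xpath):
        if xpath not in self.pages:
            raise NoSuchElementException(xpath)
        return FakeElement(self.pages[xpath])
```

Inside crawl this would be used as issuer = find_issuer(driver, list_xpath_issuer) before the return statement, so the name is bound in the same scope that builds the pd.Series. Returning None when nothing matches keeps the final return from raising NameError.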