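I've cobbled together a web scraper that looks for alternative XPaths to a piece of text content on a web page. In the example below, it should try two alternative XPaths and, depending on which one is available, store the resulting text in the variable issuer.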
try:
    xpath_issuer = ".//*[@id='dv_PRE88f496c28ad6488895f1ffc383fae8bd_list_list']/div/div[3]/table/tbody/tr[2]/td[2]"
    find_issuer = driver.find_element_by_xpath(xpath_issuer)
    issuer = re.search(r"(.+)", find_issuer.text).group()
except NoSuchElementException:
    pass
try:
    xpath_issuer = ".//*[@id='dv_PRE00e883469a264528b20fbbc31b0da4a2_list_list']/div/div[3]/table/tbody/tr[1]/td[2]/a"
    find_issuer = driver.find_element_by_xpath(xpath_issuer)
    issuer = re.search(r"(.+)", find_issuer.text).group()
except NoSuchElementException:
    pass
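In practice, though, I'll be looking for many XPaths, not just these two. I tried to define class Work(): as a way to shorten the code so that I don't have to repeat everything.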
def crawl(x):
    #Looks for variable Name here, omitted
    list_xpath_issuer = [".//*[@id='dv_PRE88f496c28ad6488895f1ffc383fae8bd_list_list']/div/div[3]/table/tbody/tr[2]/td[2]", ".//*[@id='dv_PRE00e883469a264528b20fbbc31b0da4a2_list_list']/div/div[3]/table/tbody/tr[1]/td[2]/a"]
    class Work():
        def __init__(self):
            y = self.getIssuer()
            print(y)
        def getIssuer(self):
            for i in range(len(list_xpath_issuer)):
                xpath_issuer = list_xpath_issuer[i]
                try:
                    find_issuer = driver.find_element_by_xpath(xpath_issuer)
                    issuer = re.search(r"(.+)", find_issuer.text).group().encode("utf-8")
                    print "Issuer: %s" % issuer
                    return "Xpath is %s" % xpath_issuer
                except NoSuchElementException:
                    print "This is an exception"
                    pass
    Work()
    return pd.Series([isin, instrument_name, issuer])
df[["Name", "Issuer"]] = df["ISIN"].apply(crawl)
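The problem is that, for some reason, issuer ends up empty, and the run fails with this error:
NameError: global name 'issuer' is not defined
It does find one of the XPaths and assigns the text to issuer as long as the try block succeeds, but for some reason it also runs the except block, which seems to wipe out the value in issuer. Any ideas?
Edit: traceback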
This is an exception
Issuer: Boost Issuer Plc
Xpath is .//*[@id='dv_PRE00e883469a264528b20fbbc31b0da4a2_list_list']/div/div[3]/table/tbody/tr[1]/td[2]/a
Traceback (most recent call last):
  File "xetra_lookup16.py", line 211, in <module>
    df[["Name", "Symbol", "Issuer"]] = df["ISIN"].apply(crawl)
  File "/lib/python2.7/site-packages/pandas/core/series.py", line 2053, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1064, in pandas.lib.map_infer (pandas/lib.c:58519)
  File "xetra_lookup16.py", line 208, in crawl
    return pd.Series([isin, instrument_name, issuer])
NameError: global name 'issuer' is not defined
Answer 0 (score: 1)
The except suite is working just fine.
The problem is not with your exception handling. You are trying to use the name issuer in the crawl function, but it is never set there.
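issuer is a local variable in the Work.getIssuer() method, and it is not the same variable as the issuer used in crawl. The two are entirely unrelated; setting one does not make the other appear.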
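To make the scoping point concrete, here is a minimal sketch that reproduces the same situation without Selenium or pandas. The names crawl_demo and get_issuer are made up for this illustration, and the sketch uses syntax that runs on both Python 2 and Python 3; the exact wording of the NameError message differs between the two.

```python
def crawl_demo():
    class Work(object):
        def get_issuer(self):
            # This name is local to get_issuer and disappears when
            # the method returns.
            issuer = "Boost Issuer Plc"
            return "found it"

    Work().get_issuer()

    try:
        # crawl_demo never assigns issuer, so Python looks the name up
        # in the enclosing and global scopes, finds nothing, and raises.
        return issuer
    except NameError as exc:
        return "NameError: %s" % exc


result = crawl_demo()
print(result)
```

The assignment inside the method succeeds, which is why the question's "Issuer: ..." line is printed, yet the outer function still has no issuer of its own.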
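When you create an instance of Work, it may print the result, but nothing is actually passed back to the caller. I'm not sure why you are using a class here at all; you may as well just inline the body of Work.getIssuer() into crawl, or turn it into a plain function whose return value crawl assigns to issuer.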
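A rough sketch of what that could look like follows. It is an illustration under assumptions, not the exact code for your scraper: NoSuchElementException, FakeElement, and FakeDriver below are stand-ins so the example runs without Selenium, find_issuer is a hypothetical helper name, the XPaths and the ISIN are placeholders, the regex step from the question is left out, and crawl returns a plain list where the question builds a pd.Series.

```python
class NoSuchElementException(Exception):
    """Stand-in for selenium.common.exceptions.NoSuchElementException."""


class FakeElement(object):
    """Stand-in for a located element; only the .text attribute is used."""

    def __init__(self, text):
        self.text = text


class FakeDriver(object):
    """Hypothetical stub mimicking the one driver method the question uses."""

    def __init__(self, pages):
        self.pages = pages  # maps xpath -> element text

    def find_element_by_xpath(self, xpath):
        if xpath not in self.pages:
            raise NoSuchElementException(xpath)
        return FakeElement(self.pages[xpath])


def find_issuer(driver, xpaths):
    """Return the text of the first XPath that matches, or None if none do."""
    for xpath in xpaths:
        try:
            element = driver.find_element_by_xpath(xpath)
        except NoSuchElementException:
            continue  # this candidate is missing; try the next one
        return element.text
    return None


def crawl(isin, driver, xpaths):
    # The helper's return value is bound to a local name in crawl,
    # so issuer exists when the result is built.
    issuer = find_issuer(driver, xpaths)
    return [isin, issuer]


driver = FakeDriver({"//td[2]/a": "Boost Issuer Plc"})
row = crawl("ISIN-PLACEHOLDER", driver, ["//td[2]", "//td[2]/a"])
print(row)
```

Returning None when no XPath matches means crawl always has a value to put in the result, so a missing element shows up as an empty cell rather than a NameError.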