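I am trying to build a browsing bot with phantomjs, but in some cases it is not powerful enough for my needs: when some requests fail, there is no option to retry them. In those cases I echo the requests that failed or may have failed, together with the cookies held by the browser at that moment. I then pick that information up in a Python script and make the requests from there. I use regular expressions to collect the information from the string, and then use pycurl to make the requests. I have attached the Python function that processes the string below. The function works well when I use it on its own in a test.py script, but it does not work when I add it to the main Python script, even though the interpreter, machine and folder are the same. Why does this happen?
The function: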
def getReqs(interface_text):
    if("<van LAST_LOAD>" in interface_text):
        interface_text=str(interface_text[interface_text.rfind("<van LAST_LOAD>"):])
        cookie_req=re.findall(r"<van[^>]*?type='cookies'[^>]*?>([\s\S]*?)</van>[^<]*?<van[^>]*?type='link_taken'[^>]*?href='([^']*?)'>",interface_text)
        topclicks=re.findall(r"<van[^>]*?type='top_request'[^>]*?href='([^']*?)'>",interface_text)
        imgclicks=re.findall(r"<van[^>]*?type='image_request'[^>]*?href='([^']*?)'>",interface_text)
        ind=list()
        for d in cookie_req:
            cooks=re.findall(r"([\S]*?)\t\t([\S]*?)\t\t([\S]*?)\t\t(\d+)",d[0])
            rr=dict()
            rr['cookies']=cooks
            rr['request']=d[1].strip()
            type_='image'
            for d in topclicks:
                if(rr['request']==d.strip()): type_='toplink'
            rr['type']=type_
            ind.append(rr)
        return ind
    else:
        return False
STRING:
New URL: http://domain.com/
Request (http://domain.com/css/style.css):
Request (http://domain.com/tp/filter.php?pro=936):
Request (http://domain.com/tp/a_ft.php?rand=5):
<van LAST_LOAD>
Processing images and getting hidden ones
Request (http://domain.com/tp/img.php):
Images with width set to over 85 67
Done processing images.
Checking Resourse Status
Resourse retrieval status: Started/Full F http://domain.com/
Resourse retrieval status: Started/Full F http://domain.com/css/style.css
Resourse retrieval status: Started/Full F http://domain.com/tp/filter.php?pro=936
Resourse retrieval status: Started/Full F http://domain.com/tp/a_ft.php?rand=5
Resourse retrieval status: Started/Full F http://domain.com/tp/img.php
Phantom will exit in 33775
Reclicking
Clicking Image
Random Click: 5
<van type='image_request' href='http://www.domain.com/st/thumbs/238/YOWF8GaqIz.jpg'>
Dims: 204,514,240,180
Global mouse position 0 0
Moving to mouse to 635 295
mouse moved
Trying to navigate to: http://domain.com/gallery/www.html?id=437&x=8715eb135db63642cda1ec1c19e8d529&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTE1MDEyL2FtYXRldXItcnVzc2lhbi1zZXgtdGFwZQ==&s=1
Caused by: LinkClicked
Will actually navigate: false
Sent from the page's main frame: false
Expected links: 5
<van type='cookies'>
domain.com proimg 93ffe5 1417031956
domain.com pro_cc3 394ef8df2b 1417031956
domain.com pro_cc2 3377058 1417031956
domain.com fav 1416945556 1448481556
domain.com tp MXwwfDE0MTY5NDU1NTZ8MTQxNjk0NTU1NnwwO3Rlc3QyMS5jb20= 1417031956
</van>
<van type='link_taken' href='http://domain.com/gallery/www.html?id=437&x=8715eb135db63642cda1ec1c19e8d529&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTE1MDEyL2FtYXRldXItcnVzc2lhbi1zZXgtdGFwZQ==&s=1'>
Reclicking
Clicking Image
Random Click: 3
<van type='image_request' href='http://www.domain.com/st/thumbs/730/PGy0TRimJJ.jpg'>
Dims: 204,22,240,180
Global mouse position 635 295
Moving to mouse to 143 295
mouse moved
Trying to navigate to: http://domain.com/gallery/sss.html?id=424&x=e3ad16bcdc583a324acbc3a83f654a7a&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTE2Mjk5L3RvdWNoaW5nLWJlYXV0eXMtanVpY3ktc3BvdA==&s=1
Caused by: LinkClicked
Will actually navigate: false
Sent from the page's main frame: false
Expected links: 4
<van type='cookies'>
domain.com proimg 93ffe5 1417031956
domain.com pro_cc3 394ef8df2b 1417031956
domain.com pro_cc2 3377058 1417031956
domain.com fav 1416945556 1448481556
domain.com tp MXwwfDE0MTY5NDU1NTZ8MTQxNjk0NTU1NnwwO3Rlc3QyMS5jb20= 1417031956
</van>
<van type='link_taken' href='http://domain.com/gallery/sss.html?id=424&x=e3ad16bcdc583a324acbc3a83f654a7a&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTE2Mjk5L3RvdWNoaW5nLWJlYXV0eXMtanVpY3ktc3BvdA==&s=1'>
Reclicking
Clicking Image
Random Click: 7
<van type='image_request' href='http://www.domain.com/st/thumbs/867/uLzPrb0K45.jpg'>
Dims: 424,22,240,180
Global mouse position 143 295
Moving to mouse to 143 515
mouse moved
Trying to navigate to: http://domain.com/gallery/aaa.html?id=466&x=8dcbd277bf725b468c7933cc81692be0&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTExMzQ0L3doaXRlLWFuZC1ibGFjay10ZWVuLWJhYmVzLW1hc3R1cmJhdGluZw==&s=1
Caused by: LinkClicked
Will actually navigate: false
Sent from the page's main frame: false
Expected links: 3
<van type='cookies'>
domain.com proimg 93ffe5 1417031956
domain.com pro_cc3 394ef8df2b 1417031956
domain.com pro_cc2 3377058 1417031956
domain.com fav 1416945556 1448481556
domain.com tp MXwwfDE0MTY5NDU1NTZ8MTQxNjk0NTU1NnwwO3Rlc3QyMS5jb20= 1417031956
</van>
<van type='link_taken' href='http://domain.com/gallery/aaa.html?id=466&x=8dcbd277bf725b468c7933cc81692be0&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTExMzQ0L3doaXRlLWFuZC1ibGFjay10ZWVuLWJhYmVzLW1hc3R1cmJhdGluZw==&s=1'>
The following code, on the other hand, returns an empty list.
#!/usr/bin/python
#mysql* MySQL*
__author__ = 'root'
import MySQLdb
import sys
import random
import subprocess
import re
import time
import pycurl
import cStringIO
import tldextract
def mergeCookies(cookieList,cookieFile):
data = open(cookieFile,'r').read()
precooks=re.findall(ur"([\S]*?)\t([\S]*?)\t([\S]*?)\t([\S]*?)\t([\S]*?)\t([\S]*?)\t([\S]+)",data)
total="""# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.
"""
keeper= list()
for old in precooks:
refresh=False
for new in cookieList:
print str(old[0]).strip()
new_parse=tldextract.extract(new[0])
old_parse=tldextract.extract(old[0])
if (new_parse[1].strip()==old_parse[1].strip() and str(new[1]).strip()==str(old[5]).strip() and not(str(old[0]).strip()+str(old[5]).strip() in keeper or str(new[0]).strip()+str(new[1]).strip() in keeper)):
total+=str(old[0]).strip()+"\t"+"TRUE"+"\t"+"/\tFALSE\t1579998218\t"+str(new[1]).strip()+"\t"+str(new[2]).strip()+"\n"
keeper.append(str(old[0]).strip()+str(old[5]).strip())
keeper.append(str(new[0]).strip()+str(new[1]).strip())
refresh=True
if(not refresh):
total+=str(old[0]).strip()+"\t"+"TRUE"+"\t"+"/\tFALSE\t1579998218\t"+str(old[5]).strip()+"\t"+str(old[6]).strip()+"\n"
for new in cookieList:
if(not(str(new[0]).strip()+str(new[1]).strip() in keeper)):
total+=str(new[0]).strip()+"\t"+"TRUE"+"\t"+"/\tFALSE\t1579998218\t"+str(new[1]).strip()+"\t"+str(new[2]).strip()+"\n"
keeper.append(str(new[0]).strip()+str(new[1]).strip())
open(cookieFile,'w').write(total)
def hitFormGetProxy(url,cookieFile,cookieList,proxy,lang,agent,referer,type_,theCol):
times=0
mergeCookies(cookieList,cookieFile)
while True:
times+=1
c = pycurl.Curl()
buff = cStringIO.StringIO()
c.setopt(c.URL, url)
c.setopt(c.WRITEFUNCTION, buff.write)
c.setopt(c.COOKIEFILE, cookieFile)
c.setopt(c.COOKIEJAR, cookieFile)
c.setopt(c.AUTOREFERER, True)
#c.setopt(c.COOKIESESSION, True)
#c.setopt(c.COOKIE, cookieString)
c.setopt(c.FAILONERROR, False)
c.setopt(c.FOLLOWLOCATION, True)
c.setopt(c.VERBOSE, True)
c.setopt(c.PROXY, proxy)
c.setopt(c.CONNECTTIMEOUT, 10)
c.setopt(c.TIMEOUT, 25)
c.setopt(c.MAXREDIRS, 10)
c.setopt(c.ENCODING, 'gzip,deflate,sdch')
c.setopt(c.SSL_VERIFYHOST, False)
c.setopt(c.SSL_VERIFYPEER, False)
c.setopt(c.FRESH_CONNECT, True)
c.setopt(c.HEADER, False)
c.setopt(c.HTTPHEADER, ['Accept-Language: '+str(lang)+'','Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8','Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3'])
#c.setopt(c.RETURNTRANSFER, True)
c.setopt(c.USERAGENT, agent)
c.setopt(c.REFERER, referer)
#c.setopt(c.HTTPHEADER, ['Accept: text/html', 'Accept-Charset: UTF-8'])
c.perform()
if(not (c.getinfo(pycurl.HTTP_CODE) == 200 or c.getinfo(pycurl.HTTP_CODE)==302 or c.getinfo(pycurl.HTTP_CODE)==301) and times>7):
if (type_ != 'payed'):
print "setting proxy offline"
# cur.execute("UPDATE `proxies` SET `status`='inactive',`last_checked`='"+str(int(time.time()))+"' WHERE `proxy`='"+str(proxy)+"'")
# cur.execute("UPDATE `proxies` SET `"+str(theCol)+"` = '"+str(int(time.time()))+"',`connections`= `connections`-1 WHERE `proxies`.`proxy` = '"+str(proxy)+"';")
quit()
elif(len(buff.getvalue())>500):
unallowed=False
global unallowed_urls
dmain=tldextract.extract(c.getinfo(pycurl.EFFECTIVE_URL))
for url in unallowed_urls:
dmainurl=tldextract.extract(url)
if(dmain[1].strip()==dmainurl[1].strip()):
unallowed=True
if(not unallowed):
ret=buff.getvalue()
buff.close()
return ret
else:
print "visiting unallowed url"
break;
elif(times>12):break
def getReqs(interface_text):
    if("<van LAST_LOAD>" in interface_text):
        interface_text=str(interface_text[interface_text.rfind("<van LAST_LOAD>"):])
        cookie_req=re.findall(r"<van[^>]*?type='cookies'[^>]*?>([\s\S]*?)</van>[^<]*?<van[^>]*?type='link_taken'[^>]*?href='([^']*?)'>",interface_text)
        topclicks=re.findall(r"<van[^>]*?type='top_request'[^>]*?href='([^']*?)'>",interface_text)
        imgclicks=re.findall(r"<van[^>]*?type='image_request'[^>]*?href='([^']*?)'>",interface_text)
        ind=list()
        for d in cookie_req:
            cooks=re.findall(r"([\S]*?)\t\t([\S]*?)\t\t([\S]*?)\t\t(\d+)",d[0])
            rr=dict()
            rr['cookies']=cooks
            rr['request']=d[1].strip()
            type_='image'
            for d in topclicks:
                if(rr['request']==d.strip()): type_='toplink'
            rr['type']=type_
            ind.append(rr)
        return ind
    else:
        return False
def escapeshellarg(arg):
"""
:param arg:
:return: escaped string for ussage as console argument
"""
return "\\'".join("'" + p + "'" for p in arg.split("'"))
#output = (Popen(["/usr/bin/java", "-jar", os.path.dirname(os.path.realpath(__file__))+"/headFinder.jar", self.escapeshellarg(str(tree))], stdout=PIPE).communicate()[0]).strip('')
def getSite(a):
file_ = open('bot'+str(a)+'.ini','r').read()
p = re.compile(ur'REFERER:([^;]*?);')
m = re.search(p, file_)
toReturn = m.group(1)
return str(toReturn).strip()
def proxy_status(str):
p = re.compile(ur'<van[^>]*?name=\'proxy_status\'[^>]*?value=\'([^\']*?)\'[^>]*?>')
m = re.search(p, str)
toReturn = m.group(1)
return toReturn
def random_tier(a):
data = open(a,'r').read()
data = data.split("}")
probs = data[1].strip().split('|')
num=random.randint(0,100)
totes=0
toReturn = ''
for x in range(0,len(probs)-1):
if(num>totes and num<= totes + int(probs[x].strip())): toReturn = data[x+2]
totes+=int(probs[x].strip())
return toReturn.strip()
def Random_Lang():
data = open('language.txt','r').read()
data = data.split("}")
probs = data[1].strip().split('|')
num=random.randint(0,100)
totes=0
toReturn = ''
for x in range(0,len(probs)-1):
if(num>totes and num<= totes + int(probs[x].strip())): toReturn = data[x+2]
totes+=int(probs[x].strip())
return toReturn.strip()
def Random_Agent():
num=random.randint(0,100)
if(num<16) : return random_tier("IE.txt")
elif(num>16 and num<=48) : return random_tier("firefox.txt")
elif(num>48 and num<=93) : return random_tier("CHROME.txt")
elif(num>93 and num<=97) : return random_tier("safari.txt")
elif(num>97 and num<=100) : return random_tier("opera.txt")
def Get_Trade(cur,colnum,threadnum):
print "SELECT * FROM trades_"+str(threadnum)+" WHERE position = '"+str(colnum)+"'"
cur.execute("SELECT * FROM trades_"+str(threadnum)+" WHERE position = '"+str(colnum)+"'")
try :
if (cur.rowcount > 0):
fetch = cur.fetchall()
return fetch[0][1],fetch[0][2]
else:
print "Found No Trade In That Position !"
time.sleep(8)
quit()
except MySQLdb.Error, e:
try:
print "MySQL Error [%d]: %s" % (e.args[0], e.args[1])
except IndexError:
print "MySQL Error: %s" % str(e)
time.sleep(8)
quit()
def GetPayedProxy(cur,theCol):
print "SELECT * FROM `proxies` WHERE `"+str(theCol)+"`<'"+str(int(time.time()) - 86400)+"' and `status`='active' and `response`='200' and `PAYMENT`='sharedproxies' and `connections`<3"
cur.execute("SELECT * FROM `proxies` WHERE `"+str(theCol)+"`<'"+str(int(time.time()) - 86400)+"' and `status`='active' and `response`='200' and `PAYMENT`='sharedproxies' and `connections`<3")
try :
if (cur.rowcount > 0):
fetch = cur.fetchall()
return fetch[0][0],'payed'
else:
print "Found No Shared Proxies available at this time !"
time.sleep(2)
return False,False
except MySQLdb.Error, e:
try:
print "MySQL Error [%d]: %s" % (e.args[0], e.args[1])
except IndexError:
print "MySQL Error: %s" % str(e)
time.sleep(2)
return False,False
def GetScannedProxy(cur,theCol):
print "SELECT * FROM `proxies` WHERE `"+str(theCol)+"`<'"+str(int(time.time()) - 86400)+"' and `status`='active' and `response`='200' and `PAYMENT`='scanner' and `connections`<3"
cur.execute("SELECT * FROM `proxies` WHERE `"+str(theCol)+"`<'"+str(int(time.time()) - 86400)+"' and `status`='active' and `response`='200' and `PAYMENT`='scanner' and `connections`<3")
try :
if (cur.rowcount > 0):
fetch = cur.fetchall()
return fetch[0][0],'scanned'
else:
print "Found No Scanned Proxies available at this time !"
time.sleep(2)
return False,False
except MySQLdb.Error, e:
try:
print "MySQL Error [%d]: %s" % (e.args[0], e.args[1])
except IndexError:
print "MySQL Error: %s" % str(e)
time.sleep(2)
return False,False
def GetTTProxy(cur,theCol):
print "SELECT * FROM `proxies` WHERE `"+str(theCol)+"`<'"+str(int(time.time()) - 86400)+"' and `status`='active' and `response`='200' and (`tier`='1' or `tier`='2') and `response_time`<10 and `PAYMENT`!='sharedproxies' and `PAYMENT`!='scanner' and `connections`<3"
cur.execute("SELECT * FROM `proxies` WHERE `"+str(theCol)+"`<'"+str(int(time.time()) - 86400)+"' and `status`='active' and `response`='200' and (`tier`='1' or `tier`='2') and `response_time`<10 and `PAYMENT`!='sharedproxies' and `PAYMENT`!='scanner' and `connections`<3")
try :
if (cur.rowcount > 0):
fetch = cur.fetchall()
return fetch[0][0],'tt'
else:
print "Found No T1 T2 Proxies available at this time !"
time.sleep(2)
return False,False
except MySQLdb.Error, e:
try:
print "MySQL Error [%d]: %s" % (e.args[0], e.args[1])
except IndexError:
print "MySQL Error: %s" % str(e)
time.sleep(2)
return False,False
def GetT3Proxy(cur,theCol):
print "SELECT * FROM `proxies` WHERE `"+str(theCol)+"`<'"+str(int(time.time()) - 86400)+"' and `status`='active' and `response`='200' and `tier`='3' and `response_time`<10 and `PAYMENT`!='sharedproxies' and `PAYMENT`!='scanner' and `connections`<3"
cur.execute("SELECT * FROM `proxies` WHERE `"+str(theCol)+"`<'"+str(int(time.time()) - 86400)+"' and `status`='active' and `response`='200' and `tier`='3' and `response_time`<10 and `PAYMENT`!='sharedproxies' and `PAYMENT`!='scanner' and `connections`<3")
try :
if (cur.rowcount > 0):
fetch = cur.fetchall()
return fetch[0][0],'t3'
else:
print "Found No T3 Proxies available at this time !"
time.sleep(2)
return False,False
except MySQLdb.Error, e:
try:
print "MySQL Error [%d]: %s" % (e.args[0], e.args[1])
except IndexError:
print "MySQL Error: %s" % str(e)
time.sleep(2)
return False,False
def Get_Proxy(cur,theCol):
    print "Trying to get Shared Proxy"
    proxy,type=GetPayedProxy(cur,theCol)
    if(proxy==False or type == False):
        print "Trying to get Scanned Proxy"
        proxy,type=GetScannedProxy(cur,theCol)
        if(proxy==False or type == False):
            print "Trying to get T1 T2 Proxy"
            proxy,type=GetTTProxy(cur,theCol)
            if(proxy==False or type == False):
                print "Trying to get T3 Proxy"
                proxy,type=GetT3Proxy(cur,theCol)
                if(proxy==False or type == False):
                    print "No proxies available at this time!!!"
                else:
                    return proxy,type
            else:
                return proxy,type
        else:
            return proxy,type
    else:
        return proxy,type
def getReqs(interface_text):
    toReturn = dict()
    return toReturn
if __name__=='__main__':
    data="""New URL: http://domain.com/
Request (http://domain.com/css/style.css):
Request (http://domain.com/tp/filter.php?pro=936):
Request (http://domain.com/tp/a_ft.php?rand=5):
<van LAST_LOAD>
Processing images and getting hidden ones
Request (http://domain.com/tp/img.php):
Images with width set to over 85 67
Done processing images.
Checking Resourse Status
Resourse retrieval status: Started/Full F http://domain.com/
Resourse retrieval status: Started/Full F http://domain.com/css/style.css
Resourse retrieval status: Started/Full F http://domain.com/tp/filter.php?pro=936
Resourse retrieval status: Started/Full F http://domain.com/tp/a_ft.php?rand=5
Resourse retrieval status: Started/Full F http://domain.com/tp/img.php
Phantom will exit in 33775
Reclicking
Clicking Image
Random Click: 5
<van type='image_request' href='http://www.domain.com/st/thumbs/238/YOWF8GaqIz.jpg'>
Dims: 204,514,240,180
Global mouse position 0 0
Moving to mouse to 635 295
mouse moved
Trying to navigate to: http://domain.com/gallery/www.html?id=437&x=8715eb135db63642cda1ec1c19e8d529&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTE1MDEyL2FtYXRldXItcnVzc2lhbi1zZXgtdGFwZQ==&s=1
Caused by: LinkClicked
Will actually navigate: false
Sent from the page's main frame: false
Expected links: 5
<van type='cookies'>
domain.com proimg 93ffe5 1417031956
domain.com pro_cc3 394ef8df2b 1417031956
domain.com pro_cc2 3377058 1417031956
domain.com fav 1416945556 1448481556
domain.com tp MXwwfDE0MTY5NDU1NTZ8MTQxNjk0NTU1NnwwO3Rlc3QyMS5jb20= 1417031956
</van>
<van type='link_taken' href='http://domain.com/gallery/www.html?id=437&x=8715eb135db63642cda1ec1c19e8d529&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTE1MDEyL2FtYXRldXItcnVzc2lhbi1zZXgtdGFwZQ==&s=1'>
Reclicking
Clicking Image
Random Click: 3
<van type='image_request' href='http://www.domain.com/st/thumbs/730/PGy0TRimJJ.jpg'>
Dims: 204,22,240,180
Global mouse position 635 295
Moving to mouse to 143 295
mouse moved
Trying to navigate to: http://domain.com/gallery/sss.html?id=424&x=e3ad16bcdc583a324acbc3a83f654a7a&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTE2Mjk5L3RvdWNoaW5nLWJlYXV0eXMtanVpY3ktc3BvdA==&s=1
Caused by: LinkClicked
Will actually navigate: false
Sent from the page's main frame: false
Expected links: 4
<van type='cookies'>
domain.com proimg 93ffe5 1417031956
domain.com pro_cc3 394ef8df2b 1417031956
domain.com pro_cc2 3377058 1417031956
domain.com fav 1416945556 1448481556
domain.com tp MXwwfDE0MTY5NDU1NTZ8MTQxNjk0NTU1NnwwO3Rlc3QyMS5jb20= 1417031956
</van>
<van type='link_taken' href='http://domain.com/gallery/sss.html?id=424&x=e3ad16bcdc583a324acbc3a83f654a7a&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTE2Mjk5L3RvdWNoaW5nLWJlYXV0eXMtanVpY3ktc3BvdA==&s=1'>
Reclicking
Clicking Image
Random Click: 7
<van type='image_request' href='http://www.domain.com/st/thumbs/867/uLzPrb0K45.jpg'>
Dims: 424,22,240,180
Global mouse position 143 295
Moving to mouse to 143 515
mouse moved
Trying to navigate to: http://domain.com/gallery/aaa.html?id=466&x=8dcbd277bf725b468c7933cc81692be0&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTExMzQ0L3doaXRlLWFuZC1ibGFjay10ZWVuLWJhYmVzLW1hc3R1cmJhdGluZw==&s=1
Caused by: LinkClicked
Will actually navigate: false
Sent from the page's main frame: false
Expected links: 3
<van type='cookies'>
domain.com proimg 93ffe5 1417031956
domain.com pro_cc3 394ef8df2b 1417031956
domain.com pro_cc2 3377058 1417031956
domain.com fav 1416945556 1448481556
domain.com tp MXwwfDE0MTY5NDU1NTZ8MTQxNjk0NTU1NnwwO3Rlc3QyMS5jb20= 1417031956
</van>
<van type='link_taken' href='http://domain.com/gallery/aaa.html?id=466&x=8dcbd277bf725b468c7933cc81692be0&url=aHR0cDovL3d3dy5kcnR1YmVyLmNvbS92aWRlby8xOTExMzQ0L3doaXRlLWFuZC1ibGFjay10ZWVuLWJhYmVzLW1hc3R1cmJhdGluZw==&s=1'>"""
    print getReqs(data)
    quit()
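Answer 0 (score: 1)
You define the getReqs function at line 103 of your script.
Then, at line 287, you replace that definition with this one:
def getReqs(interface_text):
    toReturn = dict()
    return toReturn
So when you call it at line 395:
print getReqs(data)
...you are calling the second definition, so it is no surprise that you print out an empty dict.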
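To make the point above concrete, here is a minimal sketch of how a later def silently rebinds a name, along with one possible way to spot such duplicates. The helper name find_duplicate_defs and the SAMPLE source are made up for illustration and are not part of the original script; the check only looks at module-level functions and assumes the source can be parsed by the interpreter running it.

```python
import ast
from collections import defaultdict


def find_duplicate_defs(source):
    """Return {name: [line numbers]} for module-level functions defined more than once."""
    tree = ast.parse(source)
    lines_by_name = defaultdict(list)
    for node in tree.body:  # only top-level statements, not nested functions
        if isinstance(node, ast.FunctionDef):
            lines_by_name[node.name].append(node.lineno)
    return {name: lines for name, lines in lines_by_name.items() if len(lines) > 1}


# Hypothetical miniature of the script: a real parser followed by a leftover stub.
SAMPLE = '''
def getReqs(interface_text):
    return ["parsed", interface_text]

def helper():
    return None

def getReqs(interface_text):
    toReturn = dict()
    return toReturn
'''

# Report every function name that is defined more than once.
duplicates = find_duplicate_defs(SAMPLE)

# Executing the sample shows which definition the name ends up bound to:
# the last def wins, so the call reaches the stub and returns an empty dict.
namespace = {}
exec(SAMPLE, namespace)
result = namespace["getReqs"]("some text")
```

In the script from the question, deleting the stub definition (or giving it a different name) should leave the name getReqs bound to the full parser, which is the version that works in test.py.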