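I have some Python code in which I set a few variables inside the methods of a class. Now I need to read those values outside the methods and use them. But instead of the values I set, I get the values the variables had when I declared them. Here is my code: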
from datetime import datetime
import MySQLdb
from scrapy import signals
from twisted.internet.task import LoopingCall

class SpiderDetails(object):
    #"""Extension for collect spider information like start/stop time."""
    update_interval = 5 # in seconds
    spiderStartTime = ''
    spiderStopTime = ''
    spiderUpdateTime = ''

    def __init__(self, crawler):
        # keep a reference to the crawler in case is needed to access to more information
        self.crawler = crawler
        # keep track of polling calls per spider
        self.pollers = {}

    @classmethod
    def from_crawler(cls, crawler):
        instance = cls(crawler)
        crawler.signals.connect(instance.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(instance.spider_closed, signal=signals.spider_closed)
        return instance

    def spider_opened(self, spider):
        # store curent timestamp in db as 'start time' for this spider
        # TODO: complete db calls
        spiderStartTime = datetime.now()
        spiderStartTime = spiderStartTime.strftime("%Y-%m-%d %H:%M:%S")
        print spiderStartTime
        # start activity poller
        poller = self.pollers[spider.name] = LoopingCall(self.spider_update, spider)
        poller.start(self.update_interval)

    def spider_closed(self, spider, reason):
        spiderStopTime = datetime.now()
        spiderStopTime = spiderStopTime.strftime("%Y-%m-%d %H:%M:%S")
        print spiderStopTime
        # store curent timestamp in db as 'end time' for this spider
        # TODO: complete db calls
        # remove and stop activity poller
        poller = self.pollers.pop(spider.name)
        poller.stop()

    def spider_update(self, spider):
        spiderUpdateTime = datetime.now()
        spiderUpdateTime = spiderUpdateTime.strftime("%Y-%m-%d %H:%M:%S")
        print spiderUpdateTime
        # update 'last update time' for this spider
        # TODO: complete db calls
        #pass

    # Open database connection
    print spiderStopTime
    db = MySQLdb.connect("localhost","root","","numismatics")
    # prepare a cursor object using cursor() method
    cursor = db.cursor()
    # Prepare SQL query to INSERT a record into the database.
    #sql = "INSERT INTO test(ID, startDate) VALUES ('', spider_start)"
    try:
        # Execute the SQL command
        cursor.execute("INSERT INTO crawlertimes (`ID`, `spiderStartTime`, `spiderStopTime`, `spiderUpdateTime`) VALUES (%s,%s,%s,%s)",('',spiderStartTime,spiderStopTime,spiderUpdateTime))
        # Commit your changes in the database
        db.commit()
    except:
        # Rollback in case there is any error
        db.rollback()
    # disconnect from server
    db.close()
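In this code I set the variable spiderStopTime in the spider_closed function, but when I print it with the print statement outside all the functions, it comes out blank. How can I get the changed value?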
Answer 0 (score: 1)
If these values are meant to be attributes of the instance, set and read them through self:
    def spider_opened(self, spider):
        self.spiderStartTime = datetime.now()
        self.spiderStartTime = self.spiderStartTime.strftime("%Y-%m-%d %H:%M:%S")
        print self.spiderStartTime
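If you need them as global variables instead, you have to mark them inside each method that assigns to them, with a statement such as global spiderStartTime.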
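The three cases can be compared side by side. The following is only a minimal sketch: the Timer class, its method names, and the last_seen variable are hypothetical stand-ins, not part of the original extension, and print is called with parentheses so the snippet also runs on Python 3.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"
last_seen = ''  # module-level (global) name, hypothetical


class Timer(object):  # hypothetical stand-in for SpiderDetails
    started = ''  # class attribute, acts as the default value

    def set_local(self):
        # Plain assignment creates a local name that is gone on return;
        # neither the class attribute nor the instance is touched.
        started = datetime.now().strftime(FORMAT)

    def set_on_instance(self):
        # Assigning through self stores the value on this instance.
        self.started = datetime.now().strftime(FORMAT)

    def set_global(self):
        # The global statement makes the assignment rebind the module-level name.
        global last_seen
        last_seen = datetime.now().strftime(FORMAT)


timer = Timer()
timer.set_local()
print(repr(timer.started))  # still the class-level default
timer.set_on_instance()
print(repr(timer.started))  # now a timestamp string
timer.set_global()
print(repr(last_seen))
```

For an extension object that Scrapy keeps alive for the whole crawl, storing the values on self is usually the less surprising of the two options, since globals are shared by every instance in the process.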
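The second half of your code, the part that opens the database connection, is executed when the class is loaded. That code runs before any crawling has happened, so at that point spiderStopTime is still the empty string it was defined as.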
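To see why the value is still empty, it helps to watch the order of execution. This is a small sketch with hypothetical names (Extension, close, events) and a placeholder timestamp; it only illustrates that statements in a class body run once, while the class is being defined.

```python
events = []  # records the order in which things run


class Extension(object):  # hypothetical name
    stop_time = ''

    def close(self):
        events.append('close() called')
        self.stop_time = '2014-01-01 00:00:00'  # placeholder value

    # This statement belongs to the class body, so it runs once,
    # while the class is being defined, before close() can ever be called.
    events.append('class body saw stop_time=%r' % stop_time)


ext = Extension()
ext.close()
print(events)
```

The class-body entry is recorded first and sees the empty default, which mirrors what happens to the print and INSERT statements in the question.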
Move that code into the spider_closed() method. That is the point at which the spider is closed and you have actually recorded the stop time:
    def spider_closed(self, spider, reason):
        self.spiderStopTime = datetime.now()
        self.spiderStopTime = self.spiderStopTime.strftime("%Y-%m-%d %H:%M:%S")
        # remove and stop activity poller
        poller = self.pollers.pop(spider.name)
        poller.stop()
        # store the recorded times now that the stop time is known
        db = MySQLdb.connect("localhost","root","","numismatics")
        cursor = db.cursor()
        try:
            cursor.execute("INSERT INTO crawlertimes (ID, spiderStartTime, spiderStopTime, spiderUpdateTime) VALUES (%s,%s,%s,%s)",
                ('', self.spiderStartTime, self.spiderStopTime, self.spiderUpdateTime))
            db.commit()
        except Exception:
            db.rollback()
        db.close()
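The insert-then-commit pattern can be tried without a MySQL server. The sketch below swaps in the standard-library sqlite3 module with an in-memory database, so it is not the MySQLdb code above: the save_times helper and the table schema are assumptions made for illustration, and sqlite3 uses ? placeholders where MySQLdb uses %s.

```python
import sqlite3
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"


def save_times(db, start_time, stop_time, update_time):
    """Insert one row (hypothetical helper); roll back on a database error."""
    cursor = db.cursor()
    try:
        # Values are passed as parameters rather than formatted into the SQL.
        cursor.execute(
            "INSERT INTO crawlertimes "
            "(spiderStartTime, spiderStopTime, spiderUpdateTime) "
            "VALUES (?, ?, ?)",
            (start_time, stop_time, update_time))
        db.commit()
    except sqlite3.Error:
        db.rollback()
        raise  # re-raise so the failure is not silently lost


db = sqlite3.connect(":memory:")
# Assumed schema: the ID column is filled in by the database.
db.execute(
    "CREATE TABLE crawlertimes ("
    "ID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "spiderStartTime TEXT, spiderStopTime TEXT, spiderUpdateTime TEXT)")
now = datetime.now().strftime(FORMAT)
save_times(db, now, now, now)
rows = db.execute(
    "SELECT spiderStartTime, spiderStopTime FROM crawlertimes").fetchall()
print(rows)
db.close()
```

Unlike the answer's version, this sketch re-raises after the rollback; swallowing the exception keeps the crawl quiet but also hides why a row never appeared.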
Answer 1 (score: 0)
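The problem is that spiderStopTime is a local variable of the function: the name goes out of scope as soon as the function returns, and the value becomes eligible for garbage collection.
Why not return the value of spiderStopTime at the end of the function?
    return spiderStopTime
You then get the value back wherever you call the function.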
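As a rough illustration of returning the value, here is a sketch with a hypothetical helper named current_timestamp. It assumes you call the function yourself; spider_closed in the question is invoked through Scrapy's signal machinery rather than by your own code, so a return value there has no obvious receiver.

```python
from datetime import datetime


def current_timestamp():
    # Hypothetical helper: build the string and hand it back to the caller.
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


stop_time = current_timestamp()  # the caller keeps the returned value
print(stop_time)
```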
Answer 2 (score: 0)
You need to use self to access these variables, for example:
    self.spiderStopTime = datetime.now()