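The following Python code runs in three phases:
Phase 1: call connect_machines() (which actually just runs the pwd command)
Phase 2: call get_machines() (which actually just reads one large file)
Phase 3: do the same thing as phase 1
Phase 3 takes much longer than phase 1.
conten_big.txt is a file containing JSON data; its size is 39 MB.
When I run the main() function, end_time2 - start2 is 22.04 s, whereas end_time1 - start1 is 8.51 s.
When I comment out the line machines_a = get_machines() and then run main, end_time1 - start1 is almost equal to end_time2 - start2.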
import sys
import pdb
import os
import json
import time
import datetime
import logging
import commands

def get_logger(logger_name):
    """configure the logger """
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
                        datefmt='%a, %d %b %Y %H:%M:%S',
                        filename='./log/%s.log' % logger_name,
                        filemode='w')
    logger = logging.getLogger(logger_name)
    return logger

logger = get_logger('test_log')

def get_machines():
    print 'get machines start'
    fp = open('./conten_big.txt', 'r')
    machines = fp.read()
    fp.close()
    machines = json.loads(machines)
    print 'get machines have finished', len(machines)
    return machines

def connect_machines(loop_count):
    for idex in range(0, loop_count):
        connect_port(idex)

def connect_port(idex):
    ret2 = 0
    cmd = 'pwd'
    start_time = datetime.datetime.now()
    (status, msg) = commands.getstatusoutput(cmd)
    end_time = datetime.datetime.now()
    cost = str(end_time - start_time)
    logger.info("[%d] --[%d] -- [%s] %s" % (idex, status, msg, cost))

def main(argv):
    """main """
    nowTime = datetime.datetime.now()
    print nowTime.strftime('%Y-%m-%d %H:%M:%S')
    machine_count = 5000
    logger.info("=====================>>>>")
    start1 = datetime.datetime.now()
    print start1.strftime('%Y-%m-%d %H:%M:%S')
    connect_machines(machine_count)
    end_time1 = datetime.datetime.now()
    print end_time1.strftime('%Y-%m-%d %H:%M:%S')
    logger.info("[%s] --- [%s] ---[%s]" % (end_time1, start1, end_time1 - start1))
    print end_time1, start1, end_time1 - start1

    # read one big file, e.g. a file of size 39M
    machines_a = get_machines()
    logger.info("=====================")
    time.sleep(30)

    start2 = datetime.datetime.now()
    print start2.strftime('%Y-%m-%d %H:%M:%S')
    connect_machines(machine_count)
    end_time2 = datetime.datetime.now()
    print end_time2.strftime('%Y-%m-%d %H:%M:%S')
    logger.info("[%s] --- [%s] ---[%s]" % (end_time2, start2, end_time2 - start2))
    print end_time2, start2, end_time2 - start2

if __name__ == '__main__':
    main(sys.argv)
Answer 0 (score: 0)
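The program takes this long because the file is large (~40 MB). As you already noted, commenting out get_machines() greatly reduces the execution time.
Comparing end_time1 - start1 with end_time2 - start2 is not meaningful: a for loop of only 5000 iterations is much faster than reading a very large file, because processing that much binary data takes longer.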
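To check where the time actually goes, it may help to time each phase on its own. In the posted code neither measured interval includes the get_machines() call, so timing that call separately would show how much the file read costs by itself. The following is only a minimal sketch: time_phase and busy_loop are hypothetical names that do not appear in the question, and the sketch is written for Python 3 (the question's code is Python 2), using time.perf_counter as the clock.

```python
import time


def time_phase(label, func, *args):
    """Run func(*args) once and return (result, elapsed_seconds).

    Hypothetical helper, not part of the original code.
    """
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    print("%s took %.3f s" % (label, elapsed))
    return result, elapsed


def busy_loop(n):
    """Stand-in workload so the sketch runs without the 39 MB file."""
    total = 0
    for i in range(n):
        total += i
    return total


if __name__ == "__main__":
    # Time the same workload twice, mirroring phase 1 and phase 3.
    _, first = time_phase("phase 1", busy_loop, 100000)
    _, second = time_phase("phase 3", busy_loop, 100000)
    print("difference: %.3f s" % (second - first))
```

In the question's program, the same helper could wrap connect_machines(machine_count) and get_machines() in turn, giving three separate figures to compare instead of two.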