I have a SQLite database that I access through Python and SQLAlchemy.
I have this function:
def _try_db_reads(runner, cfg):
    # Get last time seen.
    max_so_far = 0
    BM = BinManager(cfg.bin_dir,
                    None,
                    history=cfg.max_days,
                    max_bins=cfg.max_bins,
                    min_members=cfg.min_bin_members,
                    backsteps=cfg.hillclimb_backsteps)
    BM.load_bins()
    for i, (bin_id, users) in enumerate(BM.iterbins()):
        t0 = time.time()
        df = runner.DBM.get_df_daily_data_per_users(users)
        t1 = time.time()
        if t1 - t0 > max_so_far:
            max_so_far = t1 - t0
        logger.info('for bin ' + str(bin_id) + ' of n_users=' + str(len(users)) + ' read df of shape ' + str(df.shape) + ' in ' + str(t1 - t0) + ' max so far ' + str(max_so_far))
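Here, the for loop iterates over groups of roughly 100 users and fetches their data on each iteration. The groups are generated randomly, and each user should have roughly the same amount of data in the db.
Here is the function used to read from the db: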
def get_df_daily_data_per_users(self, users):
    logger.info("Getting all daily data for %d users from %s." %
                (len(users), self.daily_table.__tablename__))
    session = self._get_session()
    query = session.query(self.daily_table).filter(self.daily_table.user.in_(users))
    df = pd.read_sql(query.statement, query.session.bind)
    session.close()
    logger.info("Daily data query complete, %d rows of data returned." % df.shape[0])
    return df
And here is part of the generated log:
2018-11-14 19:12:19 [INFO] corvil_mlcne.user_recognition.run_ur: for bin 0 of n_users=104 read df of shape (319074, 4) in 5.26866698265 max so far 5.26866698265
2018-11-14 19:12:19 [INFO] corvil_mlcne.user_recognition.database_tools: Getting all daily data for 104 users from daily_user_website.
2018-11-14 19:12:22 [INFO] corvil_mlcne.user_recognition.database_tools: Daily data query complete, 320980 rows of data returned.
2018-11-14 19:12:22 [INFO] corvil_mlcne.user_recognition.run_ur: for bin 1 of n_users=104 read df of shape (320980, 4) in 2.64458298683 max so far 5.26866698265
2018-11-14 19:12:22 [INFO] corvil_mlcne.user_recognition.database_tools: Getting all daily data for 104 users from daily_user_website.
2018-11-14 19:12:24 [INFO] corvil_mlcne.user_recognition.database_tools: Daily data query complete, 317565 rows of data returned.
2018-11-14 19:12:24 [INFO] corvil_mlcne.user_recognition.run_ur: for bin 2 of n_users=104 read df of shape (317565, 4) in 2.48706793785 max so far 5.26866698265
2018-11-14 19:12:24 [INFO] corvil_mlcne.user_recognition.database_tools: Getting all daily data for 104 users from daily_user_website.
2018-11-14 19:12:26 [INFO] corvil_mlcne.user_recognition.database_tools: Daily data query complete, 317662 rows of data returned.
2018-11-14 19:12:26 [INFO] corvil_mlcne.user_recognition.run_ur: for bin 3 of n_users=104 read df of shape (317662, 4) in 2.27176904678 max so far 5.26866698265
2018-11-14 19:12:26 [INFO] corvil_mlcne.user_recognition.database_tools: Getting all daily data for 104 users from daily_user_website.
2018-11-14 19:12:29 [INFO] corvil_mlcne.user_recognition.database_tools: Daily data query complete, 319764 rows of data returned.
2018-11-14 19:12:29 [INFO] corvil_mlcne.user_recognition.run_ur: for bin 4 of n_users=104 read df of shape (319764, 4) in 2.42617821693 max so far 5.26866698265
2018-11-14 19:12:29 [INFO] corvil_mlcne.user_recognition.database_tools: Getting all daily data for 104 users from daily_user_website.
2018-11-14 19:12:31 [INFO] corvil_mlcne.user_recognition.database_tools: Daily data query complete, 314175 rows of data returned.
2018-11-14 19:12:31 [INFO] corvil_mlcne.user_recognition.run_ur: for bin 5 of n_users=104 read df of shape (314175, 4) in 2.26107311249 max so far 5.26866698265
2018-11-14 19:12:31 [INFO] corvil_mlcne.user_recognition.database_tools: Getting all daily data for 104 users from daily_user_website.
2018-11-14 19:12:33 [INFO] corvil_mlcne.user_recognition.database_tools: Daily data query complete, 308365 rows of data returned.
2018-11-14 19:12:33 [INFO] corvil_mlcne.user_recognition.run_ur: for bin 6 of n_users=104 read df of shape (308365, 4) in 2.14715003967 max so far 5.26866698265
2018-11-14 19:12:33 [INFO] corvil_mlcne.user_recognition.database_tools: Getting all daily data for 104 users from daily_user_website.
2018-11-14 19:12:35 [INFO] corvil_mlcne.user_recognition.database_tools: Daily data query complete, 312768 rows of data returned.
2018
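Although the calls are very similar, the first call takes more than twice as long as the subsequent ones. Why is that? Is there a way to make the first query faster?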
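One way to narrow this down is to time the identical query several times in isolation, so that any one-time cost on the first call shows up separately from the steady-state read time. The following is only a minimal sketch under stated assumptions: it uses the standard-library sqlite3 module instead of SQLAlchemy and pandas, an in-memory database filled with made-up rows, and hypothetical column names (only `user` and the table name `daily_user_website` come from the code and log above). An in-memory database will not reproduce disk or OS cache effects, so for a real measurement the connection would need to point at the actual database file.

```python
import sqlite3
import time


def time_repeated_reads(conn, users, repeats=3):
    """Run the same IN (...) query `repeats` times.

    Returns (durations_in_seconds, row_count_of_last_run).
    """
    placeholders = ",".join("?" for _ in users)
    sql = 'SELECT * FROM daily_user_website WHERE "user" IN (%s)' % placeholders
    durations = []
    rows = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        rows = conn.execute(sql, users).fetchall()
        durations.append(time.perf_counter() - t0)
    return durations, len(rows)


# Hypothetical stand-in data: 10 users with 100 rows each.
# Columns other than "user" are assumptions made for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE daily_user_website ("user" TEXT, day INTEGER, website TEXT, hits INTEGER)'
)
conn.executemany(
    "INSERT INTO daily_user_website VALUES (?, ?, ?, ?)",
    [("u%d" % u, d, "example.com", 1) for u in range(10) for d in range(100)],
)

durations, n_rows = time_repeated_reads(conn, ["u1", "u2", "u3"])
print(n_rows, durations)
```

If the first duration is consistently the largest even when the same set of users is queried each time, the extra cost is likely tied to the first access rather than to the data in that particular bin.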