
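I am using the Python library Impyla to query data in HDFS through Impala from a Python script. The data is proxy data, and there is a lot of it. I have a script that runs once a day to pull the previous day's data and compute statistics on it. Currently the query filters on the devicereceipttime field, which is stored as a timestamp.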

from impala.dbapi import connect
from impala.util import as_pandas
import pandas as pd

# Connection details omitted; host and port below are placeholders.
conn = connect(host='impala-host', port=21050)
cursor = conn.cursor()

# Pull desired features from the proxy_realtime_p table
cursor.execute('''
    select request, count(*) as count
    from default.proxy_realtime_p
    where devicereceipttime between concat(to_date(now() - interval 1 days), " 00:00:00")
                                and concat(to_date(now() - interval 1 days), " 23:59:59")
    group by request
    order by count desc
''')

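This query takes a while to run, and I would like to speed it up if possible. Given the fields listed below, is my query the most efficient one?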

devicereceipttime (timestamp)
year (int)
month (int)
day (int)
hour (int)
minute (int)
seconds (int)

Is filtering on the timestamp field with concat(to_date(...)) the most efficient way to query the previous day in Impala?
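
For comparison, here is a minimal sketch of the alternative the field list suggests: filtering on the integer year, month and day fields instead of the timestamp. It assumes the table is partitioned by those columns, which the question does not confirm. If it is, Impala can prune to a single day's partition instead of evaluating devicereceipttime on every row. The helper name previous_day_query and the alias cnt are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta


def previous_day_query(today=None):
    """Build the per-request count query for the day before `today`.

    Hypothetical helper: assumes year/month/day are integer columns
    (ideally partition keys) on default.proxy_realtime_p.
    """
    today = today or date.today()
    d = today - timedelta(days=1)
    # The values are ints taken from a date object, so formatting them
    # directly into the SQL text is safe here.
    return (
        "select request, count(*) as cnt "
        "from default.proxy_realtime_p "
        f"where year = {d.year} and month = {d.month} and day = {d.day} "
        "group by request "
        "order by cnt desc"
    )
```

The resulting string can be passed to cursor.execute(previous_day_query()) in place of the original query. Computing the date in Python also handles month and year boundaries without any string concatenation in SQL.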

0 answers: