Rounding timestamps to the nearest tenth of a second before querying the database

Time: 2019-01-05 17:55:57

Tags: python timestamp influxdb querying influxdb-python

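I use InfluxDB as a time-series database, and this kind of infrastructure has worked very well. However, I have run into an annoying problem that I don't know how to solve. When the data has sub-second precision, it seems hard to query, because the timestamps are slightly skewed. I asked for writes at a 0.5-second interval, but the timestamps stored in the database are not exactly 0.5 seconds apart.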

> select price from TSLA_0p5s limit 100
name: TSLA_0p5s
time                midprice
----                --------
2015-07-15T09:00:00Z        267.1
2015-07-15T09:00:00.499500032Z  267.1
2015-07-15T09:00:01Z        267.1
2015-07-15T09:00:01.499500032Z  267.1
2015-07-15T09:00:02Z        267.1
2015-07-15T09:00:02.499500032Z  267.1
2015-07-15T09:00:03Z        267.1
2015-07-15T09:00:03.499500032Z  267.1
2015-07-15T09:00:04Z        267.1
2015-07-15T09:00:04.499500032Z  267.1
2015-07-15T09:00:05Z        267.1
2015-07-15T09:00:05.499500032Z  267.1
2015-07-15T09:00:06Z        267.1
2015-07-15T09:00:06.499500032Z  267.1
2015-07-15T09:00:07Z        267.1
2015-07-15T09:00:07.499500032Z  267.1
2015-07-15T09:00:08Z        267.1
2015-07-15T09:00:08.499500032Z  267.1
2015-07-15T09:00:09Z        267.1
2015-07-15T09:00:09.499500032Z  267.1
2015-07-15T09:00:10Z        267.1
2015-07-15T09:00:10.499500032Z  267.1
2015-07-15T09:00:11Z        267.1
2015-07-15T09:00:11.499500032Z  267.1
2015-07-15T09:00:12Z        267.1
2015-07-15T09:00:12.499500032Z  267.1

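In the database sample above you can see that the spacing between timestamps is not regular. When I write the data to the database with influxdb-python, the timedelta is constant and set to 0.5 seconds. Here you can see that:

2015-07-15T09:00:00.499500032Z - 2015-07-15T09:00:00Z = 0.499500032 seconds (**)

2015-07-15T09:00:01Z - 2015-07-15T09:00:00.499500032Z = 0.500499968 seconds (***)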
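The two differences can be checked with integer nanosecond arithmetic, which avoids floating-point rounding. This is only a minimal sketch: `rfc3339_to_ns` is a hypothetical helper written for this illustration (it is not part of influxdb-python), and it assumes UTC timestamps ending in `Z` with at most nine fractional digits.

```python
import calendar
import re
from datetime import datetime

def rfc3339_to_ns(ts: str) -> int:
    """Convert an RFC 3339 UTC timestamp to integer nanoseconds since the epoch."""
    match = re.fullmatch(
        r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d{1,9}))?Z", ts
    )
    if match is None:
        raise ValueError(f"unsupported timestamp: {ts!r}")
    whole, frac = match.groups()
    # Whole seconds since the epoch, interpreted as UTC.
    seconds = calendar.timegm(datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").timetuple())
    # Pad the fraction to nine digits so it reads as nanoseconds.
    nanos = int(frac.ljust(9, "0")) if frac else 0
    return seconds * 1_000_000_000 + nanos

a = rfc3339_to_ns("2015-07-15T09:00:00Z")
b = rfc3339_to_ns("2015-07-15T09:00:00.499500032Z")
c = rfc3339_to_ns("2015-07-15T09:00:01Z")

print(b - a)  # 499500032 ns
print(c - b)  # 500499968 ns
```

The two gaps add up to exactly one second, so every second timestamp sits 499,968 ns before the half-second mark.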

import datetime

import numpy as np
from influxdb import InfluxDBClient

client = InfluxDBClient("localhost", 8086, username, password, "data")

delta_intraday = datetime.timedelta(seconds=0.5)

def generateDataFromDb(start_time, end_time, delta=delta_intraday):
    # start_time and end_time are strings formatted as "YYYY-MM-DD HH:MM:SS".
    current_time = datetime.datetime.strptime(start_time, "%Y-%m-%d %H:%M:%S")
    end_time = datetime.datetime.strptime(end_time, "%Y-%m-%d %H:%M:%S")
    next_time = current_time + delta

    while next_time < end_time:
        # Fetch every point in the window [current_time, next_time];
        # both bounds are inclusive.
        fetch_items = client.query(
            "select * from "
            + DB_NAME
            + " WHERE time >= '"
            + current_time.isoformat().replace("T", " ")
            + "' AND time <= '"
            + next_time.isoformat().replace("T", " ")
            + "';"
        )
        fetch_points = fetch_items.get_points()

        data = []
        data.extend(ts_fetch_items_gen(fetch_points))

        data = np.array(data)
        data = ts_extract(data, keys)

        yield np.array(data)

        # Slide the window forward by one step.
        current_time = next_time
        next_time = next_time + delta

# FROM_DAY and TO_DAY are assumed to be strings in the format parsed above.
dataGenerator = generateDataFromDb(FROM_DAY, TO_DAY)

for i, data in enumerate(dataGenerator):
    print("{}- The datashape is {}".format(i, data.shape))

You can ignore ts_fetch_items_gen() and ts_extract() here.

The output of the code above is:

0- The datashape is (2, 103)
1- The datashape is (103,)
2- The datashape is (2, 103)
3- The datashape is (103,)
4- The datashape is (2, 103)
5- The datashape is (103,)
6- The datashape is (2, 103)
7- The datashape is (103,)
8- The datashape is (2, 103)
9- The datashape is (103,)
10- The datashape is (2, 103)
11- The datashape is (103,)
12- The datashape is (2, 103)
13- The datashape is (103,)
14- The datashape is (2, 103)
15- The datashape is (103,)
16- The datashape is (2, 103)
17- The datashape is (103,)
18- The datashape is (2, 103)
19- The datashape is (103,)
...

Because of (**) and (***), I get two different data shapes in the output above, namely (103,) and (2, 103).
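The alternating shapes appear consistent with how many stored points fall inside each 0.5-second query window when both bounds are inclusive. The sketch below reproduces that count in memory, without touching InfluxDB. The timestamps are nanosecond offsets from 09:00:00 taken from the listing above, and `count_inclusive` is a hypothetical helper that mimics a `time >= a AND time <= b` filter.

```python
NS_PER_SECOND = 1_000_000_000
WINDOW_NS = NS_PER_SECOND // 2  # 0.5-second query window

# Stored timestamps from the listing, as nanosecond offsets from 09:00:00.
stored = []
for second in range(13):
    stored.append(second * NS_PER_SECOND)
    stored.append(second * NS_PER_SECOND + 499_500_032)

def count_inclusive(points, start_ns, end_ns):
    """Count points with start_ns <= t <= end_ns (both bounds inclusive)."""
    return sum(1 for t in points if start_ns <= t <= end_ns)

counts = [
    count_inclusive(stored, i * WINDOW_NS, (i + 1) * WINDOW_NS)
    for i in range(6)
]
print(counts)  # [2, 1, 2, 1, 2, 1]
```

Windows that start on a whole second catch both the whole-second point and the skewed one at .499500032, while windows that start on the half second only catch the whole-second point at their upper bound.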

Is it possible to round the timestamps to the nearest tenth of a second before querying the database, i.e. 0.499500032 --> 0.5?
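To make the intended rounding concrete, here is a minimal client-side sketch. It assumes the timestamps are available as non-negative integer nanoseconds (for instance, fetched with the `epoch="ns"` argument of `client.query`), and `round_ns` is a hypothetical helper, not an InfluxDB or influxdb-python function. It does not change what is stored in the database; it only shows the arithmetic.

```python
NS_PER_SECOND = 1_000_000_000

def round_ns(ts_ns: int, step_ns: int = NS_PER_SECOND // 10) -> int:
    """Round a non-negative nanosecond timestamp to the nearest multiple of step_ns.

    Ties round up. The default step is one tenth of a second.
    """
    return (ts_ns + step_ns // 2) // step_ns * step_ns

# Nanosecond offsets from 09:00:00, as in the listing above.
raw = [0, 499_500_032, 1_000_000_000, 1_499_500_032]
rounded = [round_ns(t) for t in raw]
print(rounded)  # [0, 500000000, 1000000000, 1500000000]

# After rounding, half-open 0.5-second buckets [start, start + 0.5 s)
# each hold exactly one point.
window_ns = NS_PER_SECOND // 2
counts = {}
for t in rounded:
    bucket = t // window_ns
    counts[bucket] = counts.get(bucket, 0) + 1
print(counts)  # {0: 1, 1: 1, 2: 1, 3: 1}
```

Integer arithmetic is used here so that the rounding itself cannot introduce a new floating-point skew. Applying the same rounding before writing the points would be another option, under the same assumptions.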

0 answers:

No answers