For the purposes of SO, here is some made-up time series data:
import pandas as pd
import numpy as np
from numpy.random import randint
np.random.seed(10) # added for reproducibility
rng = pd.date_range('10/9/2018 00:00', periods=1000, freq='1H')
df = pd.DataFrame({'Random_Number':randint(1, 100, 1000)}, index=rng)
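Question: how can I create a function that returns the resampled 97.5 and 2.5 percentile values for each day in a pandas DataFrame? I know the code below isn't even close; it only returns the upper and lower percentiles of the entire dataset. Ultimately I am trying to break this out per day, so that the index of the returned DataFrame is the timestamp (date) of each resampled day.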
def createDfs(data):
    for day in df:
        dfDay = pd.DataFrame()
        hi = df.quantile(0.975)[0]
        low = df.quantile(0.025)[0]
        data = {'upper_97.5%': [hi],
                'lower_2.5%' : [low]}
        dfUpperLower = pd.DataFrame(data)
        #dfUpperLower.set_index('Date')
    return dfUpperLower
Any tips would be greatly appreciated..
Answer 0: (score: 2)
I think you just want to use .resample together with .quantile:
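The answer's own code did not survive here, so the following is only a minimal sketch of how .resample and .quantile might be combined, not necessarily what the answerer wrote. The function name daily_percentiles and its column parameter are hypothetical, the output column names are borrowed from the question, and the hourly frequency is passed as a pd.Timedelta rather than '1H' only so the snippet runs across pandas versions.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (1000 hourly points).
np.random.seed(10)
rng = pd.date_range('10/9/2018 00:00', periods=1000, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({'Random_Number': np.random.randint(1, 100, 1000)}, index=rng)


def daily_percentiles(data, column='Random_Number'):
    """Return the 97.5th and 2.5th percentile of `column` for each calendar day.

    Hypothetical helper: assumes `data` has a DatetimeIndex.
    """
    # Group the series into daily bins; each bin is labelled by its date.
    daily = data[column].resample('D')
    # Compute one quantile per bin and combine them into a single frame.
    return pd.DataFrame({
        'upper_97.5%': daily.quantile(0.975),
        'lower_2.5%': daily.quantile(0.025),
    })


result = daily_percentiles(df)
print(result.head())
```

The returned frame is indexed by the date of each resampled day, which is what the question asks for. Since 1000 hourly points starting at midnight span 42 calendar days, the result has 42 rows, the last of which covers a partial day.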