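I'm developing a feature that lets users select a seasonal time period. It works, but I'd like to extend it.
Right now the user can specify a period in whole months, and the function returns a new DataFrame containing only the data for those months (see the code and example). What I'd like is to let the user choose a start and end month and day (e.g. January 5 to March 18) and select only the dates within that range. Is that possible?
My code is as follows: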
import numpy as np
import pandas as pd

def seasonal_period(merged_dataframe, period):
    """Returns the seasonal period specified for the time series"""
    start = period[0]
    end = period[1]
    merged_dataframe = merged_dataframe.loc[(merged_dataframe.index.month >= start) &
                                            (merged_dataframe.index.month <= end)]
    return merged_dataframe

# Testing the seasonal period with random data
df = pd.DataFrame(np.random.rand(10000, 3), index=pd.date_range('1/1/1980', periods=10000, freq='D'))
# Returns data between Jan and May
print(seasonal_period(merged_dataframe=df, period=[1, 5]))
This prints:
0 1 2
1980-01-01 0.788608 0.113614 0.328662
1980-01-02 0.208422 0.974086 0.765795
1980-01-03 0.448420 0.004947 0.184313
1980-01-04 0.400208 0.194078 0.961875
1980-01-05 0.118263 0.406548 0.358848
1980-01-06 0.824994 0.969560 0.892299
1980-01-07 0.140431 0.642784 0.961061
1980-01-08 0.235443 0.236711 0.291453
1980-01-09 0.420899 0.083092 0.277860
1980-01-10 0.185541 0.640260 0.161851
1980-01-11 0.654466 0.742445 0.398733
1980-01-12 0.270931 0.500233 0.121283
1980-01-13 0.590752 0.057112 0.477629
1980-01-14 0.122973 0.997112 0.998513
1980-01-15 0.330342 0.175655 0.240798
1980-01-16 0.559489 0.426027 0.135564
1980-01-17 0.260714 0.493863 0.420336
1980-01-18 0.214587 0.890858 0.097045
1980-01-19 0.243018 0.285315 0.112326
1980-01-20 0.334157 0.630524 0.585468
1980-01-21 0.974340 0.023412 0.349269
1980-01-22 0.435924 0.709390 0.554518
1980-01-23 0.158202 0.288950 0.747733
1980-01-24 0.855350 0.066325 0.796400
1980-01-25 0.482685 0.962369 0.948844
1980-01-26 0.605162 0.185115 0.832465
1980-01-27 0.078977 0.886044 0.823400
1980-01-28 0.062488 0.841581 0.998819
1980-01-29 0.070578 0.836261 0.732075
1980-01-30 0.386692 0.413445 0.524926
... ... ... ...
2007-04-19 0.030180 0.295753 0.696634
2007-04-20 0.246591 0.245117 0.096647
2007-04-21 0.915289 0.264874 0.754863
2007-04-22 0.222286 0.041275 0.922791
2007-04-23 0.389606 0.149993 0.200387
2007-04-24 0.113636 0.923970 0.031243
2007-04-25 0.154459 0.587656 0.508116
2007-04-26 0.525778 0.056525 0.380457
2007-04-27 0.335463 0.343321 0.191828
2007-04-28 0.249183 0.361834 0.327324
2007-04-29 0.994158 0.108749 0.375496
2007-04-30 0.674535 0.527557 0.744897
2007-05-01 0.029355 0.227039 0.418219
2007-05-02 0.946061 0.251699 0.002965
2007-05-03 0.127731 0.479151 0.634638
2007-05-04 0.045522 0.800802 0.170384
2007-05-05 0.514632 0.426107 0.557497
2007-05-06 0.974910 0.757357 0.119415
2007-05-07 0.624626 0.287442 0.211390
2007-05-08 0.408227 0.720328 0.400762
2007-05-09 0.981552 0.399663 0.953638
2007-05-10 0.256625 0.301236 0.832127
2007-05-11 0.513227 0.649790 0.174498
2007-05-12 0.229353 0.089870 0.024055
2007-05-13 0.819985 0.470549 0.388860
2007-05-14 0.640930 0.530929 0.694122
2007-05-15 0.065560 0.084560 0.677467
2007-05-16 0.297165 0.949761 0.483062
2007-05-17 0.405513 0.320957 0.678885
2007-05-18 0.315292 0.773871 0.043010
[4222 rows x 3 columns]
Any suggestions?
Answer 0 (score: 1)
You could try rewriting your function slightly:
def seasonal_period(merged_dataframe, period):
    """Returns the seasonal period specified for the time series"""
    # start and end are zero-padded 'MM-DD' strings, compared against a string index
    start = period[0]
    end = period[1]
    merged_dataframe = merged_dataframe.loc[(merged_dataframe.index >= start) &
                                            (merged_dataframe.index <= end)]
    return merged_dataframe
Then change the index to month-day strings:
df = pd.DataFrame(np.random.rand(10000, 3),
                  index=pd.date_range('1/1/1980', periods=10000, freq='D'))
df.index = df.index.strftime('%m-%d')
Then print:
print(seasonal_period(merged_dataframe=df, period=['05-30', '06-02']))
It prints the following:
0 1 2
05-30 0.506990 0.000789 0.879022
05-31 0.521576 0.812470 0.882075
06-01 0.911531 0.158134 0.943459
06-02 0.072259 0.254357 0.066428
05-30 0.060392 0.911165 0.692112
05-31 0.318079 0.379530 0.924417
06-01 0.095082 0.864511 0.967509
06-02 0.899394 0.081380 0.422184
... ... ... ...
06-02 0.460351 0.937928 0.302218
05-30 0.151066 0.908212 0.039089
05-31 0.322693 0.056857 0.375615
06-01 0.851227 0.023046 0.897951
06-02 0.876524 0.006360 0.181202
[108 rows x 3 columns]
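Note that replacing the index with '%m-%d' strings drops the year, and a simple >= / <= comparison cannot express a range that wraps around the new year (say, November 15 to February 10). If that matters, one possible variation is to keep the DatetimeIndex and build the mask from a numeric month-day key instead. This is only a sketch: the function name seasonal_period_md and its (month, day) tuple arguments are made up for illustration and are not part of the code above.

```python
import numpy as np
import pandas as pd


def seasonal_period_md(dataframe, start, end):
    """Keep rows whose (month, day) lies between start and end, inclusive.

    start and end are (month, day) tuples. Hypothetical helper for illustration.
    """
    # Encode each date as month * 100 + day, e.g. January 5 -> 105, March 18 -> 318.
    key = dataframe.index.month * 100 + dataframe.index.day
    start_key = start[0] * 100 + start[1]
    end_key = end[0] * 100 + end[1]
    if start_key <= end_key:
        # Ordinary range inside one calendar year.
        mask = (key >= start_key) & (key <= end_key)
    else:
        # Range wraps around the new year, e.g. November 15 to February 10.
        mask = (key >= start_key) | (key <= end_key)
    return dataframe.loc[mask]


# Same random test data as in the question
df = pd.DataFrame(np.random.rand(10000, 3),
                  index=pd.date_range('1/1/1980', periods=10000, freq='D'))

# January 5 to March 18 of every year; the index keeps the full dates
result = seasonal_period_md(df, start=(1, 5), end=(3, 18))
print(result)

# A range that crosses the new year
winter = seasonal_period_md(df, start=(11, 15), end=(2, 10))
print(winter)
```

Because the original index is left untouched, the result can still be grouped or resampled by year afterwards.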