import pandas as pd
from datetime import date

grouped = data.groupby('LA_DECH')
start = date(2016, 1, 1)
end = date(2016, 12, 31)
rng = pd.date_range(start, end, freq='BM')
Is there a simple way to extract the data (as a list of DataFrames) using a comparison like this:
'2016/1/1' < grouped['LA_DECH'] < '2016/2/29'
and likewise for each period in rng?
Answer 0: (score: 0)
import pandas as pd
import numpy as np
from datetime import datetime

start = datetime(2016, 1, 1)
end = datetime(2016, 12, 31)

# Sample data: one row per day with random integers in both columns
idx = pd.date_range('2015-01-01', '2017-09-01')
df = pd.DataFrame(np.random.randint(10, size=(len(idx), 2)), index=idx, columns=['VALUE', 'LA_DECH'])

# Business month ends within 2016, used as the bin edges
rng = pd.date_range(start, end, freq='BM')

# Filter by start and end date (optional: rows outside the bins are dropped by the groupby anyway)
df = df[(df.index > start) & (df.index < end)]

# Count the rows for each (period, LA_DECH) pair
print(df.groupby([pd.cut(df.index, rng), 'LA_DECH'])['LA_DECH'].count())
                          LA_DECH
(2016-01-29, 2016-02-29]  0          2
                          2          1
                          3          5
                          4          2
                          5          3
                          6          4
                          7          4
                          8          5
                          9          5
(2016-02-29, 2016-03-31]  0          4
                          2          1
                          3          4
                          4          5
                          5          2
                          6          3
                          7          3
                          8          6
                          9          3
                                    ..
(2016-08-31, 2016-09-30]  8          2
                          9          1
(2016-11-30, 2016-12-30]  0          2
                          1          1
                          2          1
                          3          1
                          4          1
                          5          5
                          6          3
                          7          5
                          8          5
                          9          6
Name: LA_DECH, Length: 104, dtype: int64
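The counts above summarize each period, but the question asks for a list of DataFrames. A minimal sketch of that, assuming (as in the answer) that the dates are in the index; the sample data and the name frames are made up for illustration. It slices with the same kind of comparison the question uses, one consecutive pair of rng edges at a time, and builds rng from pd.offsets.BMonthEnd() so it does not depend on how the 'BM' alias is spelled in a given pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one row per day, dates in the index.
idx = pd.date_range("2015-01-01", "2017-09-01")
df = pd.DataFrame(
    np.random.randint(10, size=(len(idx), 2)),
    index=idx,
    columns=["VALUE", "LA_DECH"],
)

# Business month ends in 2016, used as period edges.
rng = pd.date_range("2016-01-01", "2016-12-31", freq=pd.offsets.BMonthEnd())

# One DataFrame per (lo, hi] period, taken from consecutive pairs of edges.
frames = [
    df[(df.index > lo) & (df.index <= hi)]
    for lo, hi in zip(rng[:-1], rng[1:])
]

print(len(frames))               # number of periods
print(frames[0].index.min(), frames[0].index.max())
```

The half-open (lo, hi] bounds mirror the intervals that pd.cut produces, so each row lands in at most one period. If the dates were in a column instead of the index, the same mask could be written against that column.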