I have the following dataset:
record_id date site sick funny happy
ABCC2922-6 11/5/2018 1 1 1 1
CDEC2924-2 11/3/2018 4 1 1 1
ABCC2925-9 11/4/2018 4 1 1 1
CDEC2927-5 11/3/2018 1 1 1 1
FGHC2929-1 10/31/2018 4 1 1 1
FGHC1724-9 10/25/2018 2 3 1 1
IJKC1726-4 11/2/2018 1 3 1 1
IJKC1728-0 11/2/2018 2 3 1 1
ABCC1730-6 11/2/2018 2 3 1 1
ABCC1731-4 11/2/2018 2 3 1 1
CDEC1733-0 11/2/2018 1 3 1 1
CDEC1735-5 11/2/2018 2 3 1 1
CDEC1912-0 11/20/2018 1 1 1 1
IJKC1914-6 11/2/2018 2 6 1 1
ABCC1916-1 11/2/2018 2 6 1 1
IJKC1918-7 11/2/2018 2 1 1 1
CDEC1920-3 11/2/2018 1 6 1 1
IJKC1941-9 11/24/2018 2 4 1 1
IJKC1943-5 11/2/2018 2 4 1 1
ABCC1945-0 11/2/2018 1 4 1 1
CDEC1947-6 9/2/2018 2 1 1 1
ABCC1949-2 11/2/2018 2 4 1 1
CDEC1951-8 11/2/2018 2 5 1 1
IJKC1953-4 9/29/2018 2 1 1 1
The following code gives me part of the result I want:
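import pandas as pd
df['date'] = pd.to_datetime(df['date'])
# m1: at least one of the three flags equals 1
m1 = (df['sick'] == 1) | (df['funny'] == 1) | (df['happy'] == 1)
# m2: date falls within the last 7 days
m2 = df['date'] >= pd.Timestamp('today') - pd.DateOffset(days=7)
# m3: exclude weekends (Saturday=5, Sunday=6)
m3 = ~df['date'].dt.weekday.isin([5, 6])
dates_ocurred = df.loc[m1 & m2 & m3, 'date'].value_counts()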
dates_ocurred
2018-11-01 10
2018-11-02 6
2018-10-30 4
2018-10-31 3
Name: date, dtype: int64
places_ocurred = df.loc[m1 & m2 & m3, 'site'].value_counts()
places_ocurred
4 9
3 6
1 5
2 3
Name: site, dtype: int64
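So, I would like to know on which dates these counts occurred. Something like: site 4 has 9 cases, of which 1 occurred on day X, 3 on day Y, and so on. How can I see when these cases occurred and where they occurred in the same table?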
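To make the desired shape concrete, here is a minimal sketch of one possible approach: group the filtered rows by both site and date and count them. It uses a handful of rows copied from the dataset above. The 7-day filter (m2) is left out as an assumption so that the example does not depend on today's date, and the names sample and counts are only illustrative.

```python
import pandas as pd

# A few rows copied from the dataset above.
sample = pd.DataFrame({
    "record_id": ["ABCC2922-6", "CDEC2924-2", "ABCC2925-9", "IJKC1726-4",
                  "IJKC1728-0", "ABCC1730-6", "CDEC1912-0"],
    "date": ["11/5/2018", "11/3/2018", "11/4/2018", "11/2/2018",
             "11/2/2018", "11/2/2018", "11/20/2018"],
    "site": [1, 4, 4, 1, 2, 2, 1],
    "sick": [1, 1, 1, 3, 3, 3, 1],
    "funny": [1, 1, 1, 1, 1, 1, 1],
    "happy": [1, 1, 1, 1, 1, 1, 1],
})
sample["date"] = pd.to_datetime(sample["date"], format="%m/%d/%Y")

# Same masks as in the question, minus m2 (assumption: omitted so the
# result does not change with the current date).
m1 = (sample["sick"] == 1) | (sample["funny"] == 1) | (sample["happy"] == 1)
m3 = ~sample["date"].dt.weekday.isin([5, 6])

# One row per (site, date) pair, with the number of cases in each pair.
counts = (
    sample.loc[m1 & m3]
    .groupby(["site", "date"])
    .size()
    .rename("count")
)
print(counts)

# Optional wide layout: sites as rows, dates as columns, 0 where no cases.
wide = counts.unstack(fill_value=0)
print(wide)
```

The long form (counts) answers "how many cases per site on each day", while the wide form (wide) puts the same numbers in a site-by-date grid.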
@jpp, your solution works, but how can I filter the dates by site:
site=2
        date  count sites
1 2018-11-02     14   [2]
Site=1
        date  count sites
1 2018-11-02     14   [1]
2 2018-11-05      1   [1]
3 2018-11-20      1   [1]
Site=3
        date  count sites
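For illustration, a per-site breakdown like the one above could be sketched by filtering on site before counting dates. This is not the solution referred to in the comment, only one possible way to do it. The helper dates_for_site and the frame sample are hypothetical names, and the rows are a small subset of the dataset above with no other masks applied.

```python
import pandas as pd

# A few rows copied from the dataset above (date and site only).
sample = pd.DataFrame({
    "date": ["11/5/2018", "11/2/2018", "11/2/2018", "11/2/2018", "11/20/2018"],
    "site": [1, 1, 2, 2, 1],
})
sample["date"] = pd.to_datetime(sample["date"], format="%m/%d/%Y")


def dates_for_site(frame, site):
    """Hypothetical helper: number of rows per date for a single site."""
    subset = frame.loc[frame["site"] == site]
    return subset.groupby("date").size().rename("count").reset_index()


# Print one small table per site present in the data.
for site in sorted(sample["site"].unique()):
    print(f"site={site}")
    print(dates_for_site(sample, site))
```

In the real data the masks from the question would be applied first, for example by passing df.loc[m1 & m2 & m3] instead of sample.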