I have the code below:
import pandas as pd
import datetime
df=pd.read_csv("https://www.dropbox.com/s/08kuxi50d0xqnfc/demo.csv?dl=1")
df["date"]=pd.to_datetime(df["date"])
df['date'] = df.date.apply(lambda x: datetime.datetime.strftime(x,'%b')) # SHOWS date as MONTH
pvt_enroll=df.pivot_table(index='site', columns="date", values = 'baseline', aggfunc = {'baseline' : 'count'}, fill_value=0, margins=True) # Pivot_Table with enrollment by SITE by MONTH
pvt_enroll.to_csv("pivot_test.csv")
table_enroll_site_month = pd.read_csv('pivot_test.csv', encoding='latin-1')
table_enroll_site_month.rename(columns={'site':'Study Site'}, inplace=True)
table_enroll_site_month
Study Site Apr Jul Jun May All
0 A 5.0 0.0 8.0 4.0 17.0
1 B 9.0 0.0 11.0 5.0 25.0
2 C 6.0 1.0 3.0 20.0 30.0
3 D 5.0 0.0 3.0 2.0 10.0
4 E 5.0 0.0 5.0 0.0 10.0
5 All 30.0 1.0 30.0 31.0 92.0
And wonder how to: 1. Display months with year as Apr16 Jul16 Jun16 May16 2. Is it possible to get same table without running this step (pvt_enroll.to_csv("pivot_test.csv")? I mean, can I get same result without needing to save to .csv file first?
答案 0 :(得分:1)
I think by using %b%y
you can get 'Apr16' etc format.
I tried with the following code, without saving into .csv.
import pandas as pd
from datetime import datetime
df=pd.read_csv("demo.csv")
df["date"]=pd.to_datetime(df["date"])
df['date'] = df['date'].apply(lambda x: datetime.strftime(x,'%b%y'))
pvt_enroll=df.pivot_table(index='site', columns="date", values = 'baseline', aggfunc = {'baseline' : 'count'}, fill_value=0, margins=True) # Pivot_Table with enrollment by SITE by MONTH
pvt_enroll.reset_index(inplace=True)
pvt_enroll.rename(columns={'site':'Study Site'}, inplace=True)
print(pvt_enroll)
And I got the output as follows
date Study Site Apr16 Jul16 Jun16 May16 All
0 A 5 0 8 4 17
1 B 9 0 11 5 25
2 C 6 1 3 20 30
3 D 5 0 3 2 10
4 E 5 0 5 0 10
5 All 30 1 30 31 92