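I am trying to analyze a complete dataset and am confused about how to reshape the data with pandas. The dataset looks like this:
I am trying to make it look like this: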
             | April 2                          | April 3                          | April 4
unique_tests | total unique tests for April 2   | total unique tests for April 3   | total unique tests for April 4
positive     | total positive for April 2       | total positive for April 3       | total positive for April 4
negative     | total negative for April 2       | total negative for April 3       | total negative for April 4
remaining    | total remaining for April 2      | total remaining for April 3      | total remaining for April 4
The dates I have run through April 24.
Any ideas on how to achieve this? I couldn't get it to work with pivot tables in pandas.
Answer 0: (score: 1)
Use:
import pandas as pd

# parse thousands-separated values as numbers and the date column as datetimes
df = pd.read_csv(file, thousands=',', parse_dates=['date'])
# group by a custom date label (e.g. '02-Apr'), sum the numeric columns, then transpose
df1 = df.groupby(df['date'].dt.strftime('%d-%b')).sum(numeric_only=True).T
Alternatively, reassign the date column with the newly formatted date strings and group by it:
df1 = df.assign(date = df['date'].dt.strftime('%d-%b')).groupby('date').sum().T
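To see what the result looks like, here is a minimal sketch on made-up data. The original dataset is not shown in the question, so the CSV contents and the ISO date format below are assumptions; only the column names are taken from the desired layout in the question. It also shows how the same table could be produced with pivot_table, since that was what the question first tried.

```python
import io
import pandas as pd

# Hypothetical sample data: the values are invented for illustration,
# the column names follow the rows of the desired layout.
csv_text = '''date,unique_tests,positive,negative,remaining
2020-04-02,"1,000",100,800,100
2020-04-02,500,50,400,50
2020-04-03,"2,000",200,"1,600",200
2020-04-04,300,30,240,30
'''

df = pd.read_csv(io.StringIO(csv_text), thousands=',', parse_dates=['date'])

# Group by the formatted date label, sum each numeric column, then transpose
# so that metrics become rows and dates become columns.
totals = df.groupby(df['date'].dt.strftime('%d-%b')).sum(numeric_only=True).T

# Equivalent pivot_table call: with only `columns` given, every remaining
# column is aggregated and returned as a row (rows come back sorted by name,
# so reindex to restore the original order).
pivoted = (
    df.assign(date=df['date'].dt.strftime('%d-%b'))
      .pivot_table(columns='date', aggfunc='sum')
      .reindex(totals.index)
)

print(totals)
```

Note that '%d-%b' labels sort as plain strings, which is fine within a single month; if the data spanned several months, grouping on the datetime itself and formatting the column labels afterwards would keep them in chronological order.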