This is the desired data frame:
This is the CSV I am reading:
This is my code:
import pandas as pd
df = pd.read_csv('Holidays.csv')
#print(df.head())
df = df.transpose()
print(df)
This is the CSV:
State Official Leaves
Michigain 28-01-2019
Texas 30-01-2019
Florida 05-02-2019
Hawaii 04-07-2019
Arizona 04-07-2019
North Carolina 04-07-2019
Illinois 04-07-2019
Ohio 04-07-2019
Georgia 04-07-2019
Michigain 04-07-2019
Texas 04-07-2019
Florida 04-07-2019
California 04-07-2019
Answer 0 (score: 2)
Assuming df looks like the one below (I built a sample data frame, since you only provided an image):
print(df)
States Official Leaves
0 Michigan 2019-01-28
1 Texas 2019-01-30
2 Florida 2019-02-05
3 Hawaii 2019-07-04
Add a column that holds the day and month as a string, then use pd.crosstab():
df['day_month']=df['Official Leaves'].dt.strftime('%b-%d')
pd.crosstab(df.States, df.day_month).astype(bool).reset_index().rename_axis(None, axis=1)
#if you want states as index, just remove the reset_index() from the code
States Feb-05 Jan-28 Jan-30 Jul-04
0 Florida True False False False
1 Hawaii False False False True
2 Michigan False True False False
3 Texas False False True False
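Note: if the dtype of the Official Leaves column is object, convert it to datetime first. The dates in the CSV are day-first (for example 28-01-2019), so pass the format explicitly:
df['Official Leaves'] = pd.to_datetime(df['Official Leaves'], format='%d-%m-%Y')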
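For reference, the steps in this answer can be put together as one runnable sketch. It uses a few rows taken from the question, read from an inline string instead of Holidays.csv. The comma delimiter and the variable names csv_text and table are assumptions made for illustration, and the column is named State as in the question's CSV.

```python
import io

import pandas as pd

# Inline stand-in for Holidays.csv; the comma delimiter is an assumption.
csv_text = """State,Official Leaves
Michigan,28-01-2019
Texas,30-01-2019
Florida,05-02-2019
Hawaii,04-07-2019
Texas,04-07-2019
"""

df = pd.read_csv(io.StringIO(csv_text))

# The dates are day-first, so state the format instead of letting pandas guess.
df['Official Leaves'] = pd.to_datetime(df['Official Leaves'], format='%d-%m-%Y')

# String label such as 'Jan-28' for every holiday.
df['day_month'] = df['Official Leaves'].dt.strftime('%b-%d')

# crosstab counts State/label pairs; astype(bool) turns the counts into flags.
table = pd.crosstab(df['State'], df['day_month']).astype(bool)
table = table.rename_axis(None, axis=1)  # drop the 'day_month' header label
print(table)
```

Because the column labels are plain strings here, crosstab sorts them alphabetically (Feb before Jan), not chronologically.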
Answer 1 (score: 1)
In one (long) line:
# pivot leaves NaN where a state has no holiday on a date; notna() turns that into True/False
df = df.pivot(index='State', columns='Official Leaves', values='Official Leaves') \
.notna()
To change the column names to that date format:
# the dates are day-first, so give the format explicitly; DatetimeIndex has strftime built in
df.columns = pd.to_datetime(df.columns, format='%d-%m-%Y') \
.strftime('%b-%d')
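A variation on this pivot approach, as a minimal sketch: if the dates are parsed before pivoting, the columns come out in chronological order and can be relabelled afterwards. The inline sample rows, the comma delimiter, and the names csv_text and wide are assumptions for illustration, not part of the original answer.

```python
import io

import pandas as pd

# Inline stand-in for Holidays.csv; the comma delimiter is an assumption.
csv_text = """State,Official Leaves
Michigan,28-01-2019
Texas,30-01-2019
Florida,05-02-2019
Hawaii,04-07-2019
Texas,04-07-2019
"""

df = pd.read_csv(io.StringIO(csv_text))

# Parse the day-first dates up front so the pivoted columns sort by date.
df['Official Leaves'] = pd.to_datetime(df['Official Leaves'], format='%d-%m-%Y')

# One row per state, one column per date; missing combinations become NaT,
# which notna() maps to False.
wide = df.pivot(index='State', columns='Official Leaves',
                values='Official Leaves').notna()

# Relabel the datetime columns as 'Jan-28', 'Jan-30', ...
wide.columns = wide.columns.strftime('%b-%d')
print(wide)
```

Note that pivot raises an error if the same State/date pair appears twice, whereas pd.crosstab simply counts the duplicates.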