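I have dynamic item names, so I'd like the code to handle them:
This is the approach I'm currently working on, but I'm not sure whether there is a simpler way to build the result than creating an empty dataframe and trying to fill data into it. Any suggestions are welcome, thanks!
Sample df: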
Name Date Item Minutes
Dave 10-02-2017 item1 3
Dave 10-02-2017 item2 5
Joe 10-02-2017 item3 2
Dave 10-02-2017 item2 1
Dave 10-02-2017 item2 2
Marcia 10-02-2017 item1 5
Amy 10-02-2017 item2 3
Code:
#find unique values in df column
unique_df = pd.DataFrame(df['Item'].unique())
#number length of unique rows
unique_df_len = len(unique_df)
#create empty dataframe using unique number of items discovered
new_df = pd.DataFrame([(0,)*unique_df_len])
#replace columns headings with unique row value names
new_df.columns = unique_df.iloc[:,0]
#loop through empty dataframe column headings
for column_name in list(new_df):
    #loop through df looking for each item name
    for index, row in df.iterrows():
        df['Item'] = df.lookup(df.index,df[column_name])
This is where I'm stuck: the second loop above does not work.
Desired output:
Name Date item1 item2 item3 total minutes
Dave 10-02-2017 1 3 0 11
Joe 10-02-2017 0 0 1 2
Marcia 10-02-2017 1 0 0 5
Amy 10-02-2017 0 1 0 3
Answer 0 (score: 3)
A simple pivot_table:
total=df.groupby(['Name','Date']).Minutes.sum()
df=pd.pivot_table(df,index=['Name','Date'],columns='Item',values='Minutes',aggfunc=len,fill_value=0)
Out[1070]:
Item item1 item2 item3
Name Date
Amy 10-02-2017 0 1 0
Dave 10-02-2017 1 3 0
Joe 10-02-2017 0 0 1
Marcia 10-02-2017 1 0 0
df['total minutes']=total
df.reset_index()
Out[1111]:
Item Name Date item1 item2 item3 total minutes
0 Amy 10-02-2017 0 1 0 3
1 Dave 10-02-2017 1 3 0 11
2 Joe 10-02-2017 0 0 1 2
3 Marcia 10-02-2017 1 0 0 5
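For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same pivot_table idea. It rebuilds the sample data from the question and keeps the original frame intact instead of overwriting df. The names keys, counts and result are just illustrative, and it uses aggfunc="count" with an explicit integer cast in place of aggfunc=len; that is my assumption about what behaves the same across pandas versions, not something from the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Dave", "Dave", "Joe", "Dave", "Dave", "Marcia", "Amy"],
    "Date": ["10-02-2017"] * 7,
    "Item": ["item1", "item2", "item3", "item2", "item2", "item1", "item2"],
    "Minutes": [3, 5, 2, 1, 2, 5, 3],
})

keys = ["Name", "Date"]  # illustrative name for the grouping columns

# Count rows per (Name, Date, Item); combinations that never occur become 0.
counts = pd.pivot_table(
    df, index=keys, columns="Item", values="Minutes",
    aggfunc="count", fill_value=0,
).astype(int)

# Sum of minutes per (Name, Date); the assignment aligns on the shared MultiIndex.
counts["total minutes"] = df.groupby(keys)["Minutes"].sum()

result = counts.reset_index()
result.columns.name = None  # drop the leftover "Item" label on the columns
print(result)
```

Because the item columns come from whatever values appear in Item, nothing needs to be hard-coded when new item names show up.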
Or you can use crosstab to get the counts:
df=pd.crosstab(index=[df['Name'],df['Date']],columns=df['Item'])
df
Out[1093]:
Item item1 item2 item3
Name Date
Amy 10-02-2017 0 1 0
Dave 10-02-2017 1 3 0
Joe 10-02-2017 0 0 1
Marcia 10-02-2017 1 0 0
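The crosstab output above has the counts but not the total minutes column from the desired output. One way to add it, sketched here under the assumption that the totals are a plain groupby sum joined back on the (Name, Date) index, is shown below. The names counts, totals and result are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Dave", "Dave", "Joe", "Dave", "Dave", "Marcia", "Amy"],
    "Date": ["10-02-2017"] * 7,
    "Item": ["item1", "item2", "item3", "item2", "item2", "item1", "item2"],
    "Minutes": [3, 5, 2, 1, 2, 5, 3],
})

# crosstab counts how often each Item occurs per (Name, Date) pair.
counts = pd.crosstab(index=[df["Name"], df["Date"]], columns=df["Item"])

# Total minutes per (Name, Date), named so it becomes a column on join.
totals = df.groupby(["Name", "Date"])["Minutes"].sum().rename("total minutes")

# Both objects share the (Name, Date) index, so join lines them up row by row.
result = counts.join(totals).reset_index()
print(result)
```

crosstab defaults to counting occurrences, so no aggfunc or fill_value is needed for the item columns.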