I have a DataFrame like this:
import pandas as pd
data = {
'Num' : ['1','2', '3','4','5','6','7'],
'col1': ['val1', 'val6', 'val3', 'val7', 'val2','val4','val5'],
'col2': ['','val3','val5','','','',''],
'col3': ['','val1','val2','','','','']
}
df = pd.DataFrame(data)
df["myvals"]=1
   Num  col1  col2  col3  myvals
0    1  val1                   1
1    2  val6  val3  val1       1
2    3  val3  val5  val2       1
3    4  val7                   1
4    5  val2                   1
5    6  val4                   1
6    7  val5                   1
I am trying to pivot the values in 'col1', 'col2' and 'col3' into one shared set of pivot columns, but so far I can only capture the values that come from 'col1':
pd.pivot_table(df, values="myvals", index=["Num"], columns="col1", fill_value=0)
col1  val1  val2  val3  val4  val5  val6  val7
Num
1        1     0     0     0     0     0     0
2        0     0     0     0     0     1     0
3        0     0     1     0     0     0     0
4        0     0     0     0     0     0     1
5        0     1     0     0     0     0     0
6        0     0     0     1     0     0     0
7        0     0     0     0     1     0     0
Any ideas on how to also bring in the values from 'col2' and 'col3', as shown below, so that the rows with 'Num' = 2 and 'Num' = 3 contain multiple 1s?
col1  val1  val2  val3  val4  val5  val6  val7
Num
1        1     0     0     0     0     0     0
2        1     0     1     0     0     1     0
3        0     1     1     0     1     0     0
4        0     0     0     0     0     0     1
5        0     1     0     0     0     0     0
6        0     0     0     1     0     0     0
7        0     0     0     0     1     0     0
Answer 0 (score: 1)
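This is more of a get_dummies problem. The helper column 'myvals' is dropped first so that it does not turn into a dummy column of its own, and on pandas 2.0 or later sum(level=0) is no longer available, so the rows are grouped on the index level instead:
import numpy as np
df.drop(columns='myvals').replace('', np.nan).set_index('Num').stack().str.get_dummies().groupby(level=0).sum()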
Out[1125]:
     val1  val2  val3  val4  val5  val6  val7
Num
1       1     0     0     0     0     0     0
2       1     0     1     0     0     1     0
3       0     1     1     0     1     0     0
4       0     0     0     0     0     0     1
5       0     1     0     0     0     0     0
6       0     0     0     1     0     0     0
7       0     0     0     0     1     0     0
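
For comparison, here is a minimal self-contained sketch of a different route to the same indicator table, using melt and pd.crosstab rather than the get_dummies chain from the answer. It is not the answerer's method. The names long, val and result are made up for illustration, and the sketch assumes that an empty string always means "no value".

```python
import pandas as pd

data = {
    'Num': ['1', '2', '3', '4', '5', '6', '7'],
    'col1': ['val1', 'val6', 'val3', 'val7', 'val2', 'val4', 'val5'],
    'col2': ['', 'val3', 'val5', '', '', '', ''],
    'col3': ['', 'val1', 'val2', '', '', '', ''],
}
df = pd.DataFrame(data)

# Reshape to long form: one row per (Num, source column, value).
# 'val' is a hypothetical column name chosen for this sketch.
long = df.melt(id_vars='Num', value_vars=['col1', 'col2', 'col3'],
               value_name='val')

# Discard the empty-string placeholders (assumed to mean "no value").
long = long[long['val'] != '']

# Count how often each value occurs per Num, then cap the counts at 1
# so the table holds 0/1 indicators even if a value repeats in a row.
result = pd.crosstab(long['Num'], long['val']).clip(upper=1)
print(result)
```

With the sample data this should give the same 0/1 layout as the table above, with 'Num' as the index. The clip(upper=1) step only matters if the same value can appear in more than one column of a row; drop it if counts are wanted instead.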