我在df中有这个专栏:
Column A
--------
x-y: 1
x-y: 2
x-y: 2
x-x: 1
y-x: 2
y-y: 3
y-y: 3
是否有可能将它们分解为这样的矩阵?
1 2 3 *based on the range of number of column A
--------------
x-x 1 0 0 because there's 1 'x-x: 1'
x-y 1 2 0 because there's 1 'x-y: 1' and 2 'x-y: 2'
y-x 0 1 0 because there's 1 'x-y: 2'
y-y 0 0 2 because there's 2 'y-y: 3'
谢谢!
答案 0 :(得分:2)
您可以将reset_index
与groupby
一起使用,然后按size
获取计数并按unstack
重新塑造:
print (df)
Column A
x-y 1
x-y 2
x-y 2
x-x 1
y-x 2
y-y 3
y-y 3
print (df.reset_index())
index Column A
0 x-y 1
1 x-y 2
2 x-y 2
3 x-x 1
4 y-x 2
5 y-y 3
6 y-y 3
df = df.reset_index().groupby(['index','Column A']).size().unstack(fill_value=0)
print (df)
Column A 1 2 3
index
x-x 1 0 0
x-y 1 2 0
y-x 0 1 0
y-y 0 0 2
crosstab
的另一个解决方案:
df = pd.crosstab(df.index, df['Column A'])
print (df)
Column A 1 2 3
row_0
x-x 1 0 0
x-y 1 2 0
y-x 0 1 0
y-y 0 0 2
如果有必要拆分:
print (df)
Column A
0 x-y: 1
1 x-y: 2
2 x-y: 2
3 x-x: 1
4 y-x: 2
5 y-y: 3
6 y-y: 3
df[['a','b']] = df['Column A'].str.split(':\s+', expand=True)
print (df)
Column A a b
0 x-y: 1 x-y 1
1 x-y: 2 x-y 2
2 x-y: 2 x-y 2
3 x-x: 1 x-x 1
4 y-x: 2 y-x 2
5 y-y: 3 y-y 3
6 y-y: 3 y-y 3
df = df.groupby(['a','b']).size().unstack(fill_value=0)
print (df)
b 1 2 3
a
x-x 1 0 0
x-y 1 2 0
y-x 0 1 0
y-y 0 0 2