I want to transpose the following table:
Name | State | Value
~~~~~~~~~~~~~~~~~~~~
nameA | state1 | 1
nameA | state2 | 5
nameA | state1 | 9
nameA | state1 | 2
nameB | state2 | 3
nameB | state1 | 1
into a table like this:
Name | range1_state1 | range1_state2 | range2_state1 | range2_state2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
nameA | 2 | 1 | 0 | 1
nameB | 1 | 0 | 1 | 0
where range1 = [0,5] and range2 = (5,10).
The values in the second table are the numbers of occurrences in the first table.
Thanks
Answer 0 (score: 2)
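The snippets in this answer use a DataFrame named df without defining it. A minimal sketch for building it, assuming it simply holds the question's table with the same three column names:

```python
import pandas as pd

# Assumed reconstruction of the question's table; the answer never shows how df was built
df = pd.DataFrame({
    'Name': ['nameA', 'nameA', 'nameA', 'nameA', 'nameB', 'nameB'],
    'State': ['state1', 'state2', 'state1', 'state1', 'state2', 'state1'],
    'Value': [1, 5, 9, 2, 3, 1],
})
print(df)
```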
import pandas as pd

# df is the question's table; bin Value into [0, 5] and (5, 10] first
print(pd.cut(df['Value'], bins=[0, 5, 10], include_lowest=True))
0     [0, 5]
1     [0, 5]
2    (5, 10]
3     [0, 5]
4     [0, 5]
5     [0, 5]
Name: Value, dtype: category
Categories (2, object): [[0, 5] < (5, 10]]
# Label each bin, then prefix every state with the label of its bin
df['rng'] = pd.cut(df['Value'], bins=[0, 5, 10],
                   labels=['range1', 'range2'], include_lowest=True)
df['State'] = df['rng'].astype(str) + '_' + df['State']
print(df)
    Name          State  Value     rng
0  nameA  range1_state1      1  range1
1  nameA  range1_state2      5  range1
2  nameA  range2_state1      9  range2
3  nameA  range1_state1      2  range1
4  nameB  range1_state2      3  range1
5  nameB  range1_state1      1  range1
# Count how often each combined State occurs for each Name
df = pd.crosstab(df.Name, df.State)
print(df)
State  range1_state1  range1_state2  range2_state1
Name
nameA              2              1              1
nameB              1              1              0
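The crosstab result only contains combinations that actually occur, so range2_state2 is missing from it. If all four columns from the question are wanted, one possible approach is to reindex the columns and fill the gaps with zeros. The sketch below is an addition, not part of the original answer; the names ranges, states, all_columns and counts are illustrative, and df is rebuilt from the question's table so the block runs on its own:

```python
from itertools import product

import pandas as pd

# Assumed reconstruction of the question's table
df = pd.DataFrame({
    'Name': ['nameA', 'nameA', 'nameA', 'nameA', 'nameB', 'nameB'],
    'State': ['state1', 'state2', 'state1', 'state1', 'state2', 'state1'],
    'Value': [1, 5, 9, 2, 3, 1],
})

ranges = ['range1', 'range2']
states = ['state1', 'state2']

# Same binning as in the answer: [0, 5] -> range1, (5, 10] -> range2
df['rng'] = pd.cut(df['Value'], bins=[0, 5, 10], labels=ranges, include_lowest=True)
key = (df['rng'].astype(str) + '_' + df['State']).rename('State')

# Every range/state combination, in the column order the question asks for
all_columns = [f'{r}_{s}' for r, s in product(ranges, states)]

# Combinations that never occur become columns filled with 0
counts = pd.crosstab(df['Name'], key).reindex(columns=all_columns, fill_value=0)
print(counts)
```

With the question's data, nameA gets the counts 2, 1, 1, 0 and nameB gets 1, 1, 0, 0 across the four columns.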
Edit:
You can check how the values are binned with this example:
import numpy as np

df1 = pd.DataFrame({'Value': np.arange(11)})
df1['bins'] = pd.cut(df1['Value'], bins=[0, 5, 10], include_lowest=True)
print(df1)
    Value     bins
0       0   [0, 5]
1       1   [0, 5]
2       2   [0, 5]
3       3   [0, 5]
4       4   [0, 5]
5       5   [0, 5]
6       6  (5, 10]
7       7  (5, 10]
8       8  (5, 10]
9       9  (5, 10]
10     10  (5, 10]
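To see why include_lowest=True matters here, the same check can be run with and without it. This is a sketch added for illustration; the names values, with_lowest, without_lowest and check are not from the original answer. Labels are used instead of interval objects because recent pandas versions may print the lowest interval differently (for example as (-0.001, 5.0] rather than [0, 5]).

```python
import numpy as np
import pandas as pd

values = pd.Series(np.arange(11), name='Value')
labels = ['range1', 'range2']

# Left edge of the first bin included: 0 falls into range1
with_lowest = pd.cut(values, bins=[0, 5, 10], labels=labels, include_lowest=True)

# Default behaviour: bins are (0, 5] and (5, 10], so 0 belongs to no bin and becomes NaN
without_lowest = pd.cut(values, bins=[0, 5, 10], labels=labels)

check = pd.DataFrame({
    'Value': values,
    'with_lowest': with_lowest,
    'without_lowest': without_lowest,
})
print(check)
```

Only the row with Value 0 differs between the two columns; 5 lands in range1 and 6 in range2 in both cases.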