我想使用Python取消行标签的名称。
已清除数据以删除总计和na行。
代码如下:
Description | Table | Chair
***Manila*** | |
Apple | 1 | 3
Pair | 0 | 1
Orange | 1 | 0
Watermelon | 0 | 5
Banana | 0 | 7
***Quezon*** | |
DragonFruit | 0 | 0
StarApple | 0 | 0
Longan | 0 | 1
Cherries | 1 | 2
Mango | 0 | 5
表格图片:
我希望代码看起来如何:
Description | Day | Table | Chair
Manila | 1 | 1 | 3
Manila | 2 | 0 | 1
Manila | 3 | 1 | 0
Manila | 4 | 0 | 5
Manila | 5 | 0 | 7
Quezon | 1 | 0 | 0
Quezon | 2 | 0 | 0
Quezon | 3 | 0 | 1
Quezon | 4 | 1 | 2
Quezon | 5 | 0 | 5
答案 0 :(得分:0)
答案 1 :(得分:0)
IIUC,请尝试以下操作:
df = pd.DataFrame({'Description':['Manila',1,2,3,4,5,'Quezon',1,2,3,4,5],
'Table':['',1,0,1,0,0,'',0,0,0,1,0],
'Chair':['',3,1,0,5,7,'',0,0,1,2,5]})
print(df)
输出:
Description Table Chair
0 Manila
1 1 1 3
2 2 0 1
3 3 1 0
4 4 0 5
5 5 0 7
6 Quezon
7 1 0 0
8 2 0 0
9 3 0 1
10 4 1 2
11 5 0 5
仅使用正则表达式从单词中创建新列并向前填充:
df['Group'] = df['Description'].str.extract('(\w+)').ffill()
#Drop those "header records"
df_out = df[df['Description'].str.contains('\w+').isna()]\
.reindex(['Group','Description','Table','Chair'], axis=1)
print(df_out)
输出:
Group Description Table Chair
1 Manila 1 1 3
2 Manila 2 0 1
3 Manila 3 1 0
4 Manila 4 0 5
5 Manila 5 0 7
7 Quezon 1 0 0
8 Quezon 2 0 0
9 Quezon 3 0 1
10 Quezon 4 1 2
11 Quezon 5 0 5
#Another way, look for blanks in table or chairs:
df = pd.DataFrame({'Description':['Manila',1,2,3,4,5,'Quezon',1,2,3,4,5],
'Table':[np.nan,1,0,1,0,0,np.nan,0,0,0,1,0],
'Chair':[np.nan,3,1,0,5,7,np.nan,0,0,1,2,5]})
m = df['Table'].isna()
df['Group'] = df.loc[m, 'Description']
df['Group'] = df['Group'].ffill()
df_out = df.loc[~m].reindex(['Group','Description','Table','Chair'], axis=1)
输出:
Group Description Table Chair
1 Manila 1 1.0 3.0
2 Manila 2 0.0 1.0
3 Manila 3 1.0 0.0
4 Manila 4 0.0 5.0
5 Manila 5 0.0 7.0
7 Quezon 1 0.0 0.0
8 Quezon 2 0.0 0.0
9 Quezon 3 0.0 1.0
10 Quezon 4 1.0 2.0
11 Quezon 5 0.0 5.0