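When "System" appears in the Type column, I want to delete all values in that row except the value in the Name column. When "Hardware" appears in the Type column, I want to delete all values in that row except the value in the Color column. After that, I want to split every non-empty cell in the Text column into multiple rows, while keeping the rows where that column is empty.
This is the data frame I have: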
df
Type Text Name ID Color
System aca\nmaca\nstream\nphase\n Gary 123 Red
System aca\nmaca\nstream\nphase\n Mary 3254 Yellow
Hardware a\nmaca\nstream\nphase\n Jerry 158 White
Software ca\nmaca\nstream\nphase\n Perry 56414 Green
Software aca\nmac\nstream\nphase\n Jimmy 548 Blue
System aca\nmaca\nstream\nphase\n Marc 5658 Black
System aca\nmaca\nstram\npha\n John 867 Pink
Hardware aca\nma\nstream\nphase\n Sam 665 Gray
Hardware aca\nmaca\nstream\nphase\n Jury 5784 Azure
System aca\nmaca\nstream\nphase\n Larry 5589 Fawn
Software aca\nmaca\nst\nphase\n James 6568 Magenta
System aca\nmaca\nstream\nph\n Kevin 568 Cyan
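For anyone who wants to reproduce this, the frame above can be built roughly as follows. This is only a sketch: it assumes the `\n` sequences shown in the Text column are real newline characters, not a literal backslash followed by `n`.

```python
import pandas as pd

# Reconstruction of the sample frame shown above.
# Assumption: each "\n" in Text is an actual newline character.
df = pd.DataFrame({
    "Type": ["System", "System", "Hardware", "Software", "Software", "System",
             "System", "Hardware", "Hardware", "System", "Software", "System"],
    "Text": [
        "aca\nmaca\nstream\nphase\n", "aca\nmaca\nstream\nphase\n",
        "a\nmaca\nstream\nphase\n", "ca\nmaca\nstream\nphase\n",
        "aca\nmac\nstream\nphase\n", "aca\nmaca\nstream\nphase\n",
        "aca\nmaca\nstram\npha\n", "aca\nma\nstream\nphase\n",
        "aca\nmaca\nstream\nphase\n", "aca\nmaca\nstream\nphase\n",
        "aca\nmaca\nst\nphase\n", "aca\nmaca\nstream\nph\n",
    ],
    "Name": ["Gary", "Mary", "Jerry", "Perry", "Jimmy", "Marc",
             "John", "Sam", "Jury", "Larry", "James", "Kevin"],
    "ID": [123, 3254, 158, 56414, 548, 5658, 867, 665, 5784, 5589, 6568, 568],
    "Color": ["Red", "Yellow", "White", "Green", "Blue", "Black",
              "Pink", "Gray", "Azure", "Fawn", "Magenta", "Cyan"],
})
print(df.shape)
```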
This is the expected result:
Type      Text    Name   ID     Color
System            Gary
System            Mary
Hardware                        White
Software  ca      Perry  56414  Green
Software  maca    Perry  56414  Green
Software  stream  Perry  56414  Green
Software  phase   Perry  56414  Green
Software  aca     Jimmy  548    Blue
Software  mac     Jimmy  548    Blue
Software  stream  Jimmy  548    Blue
Software  phase   Jimmy  548    Blue
System            Marc
System            John
Hardware                        Gray
Hardware                        Azure
System            Larry
Software  aca     James  6568   Magenta
Software  maca    James  6568   Magenta
Software  st      James  6568   Magenta
Software  phase   James  6568   Magenta
System            Kevin
To split the cells into multiple rows, I tried the following function:
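def SepInRows(df, c):
    # Split column c on newlines and stack the pieces into one long Series
    s = df[c].str.split('\n', expand=True).stack()
    # Level 0 of the stacked index is the original row label of each piece
    i = s.index.get_level_values(0)
    df2 = df.loc[i].copy()
    df2[c] = s.values
    return df2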
However, it drops the rows that have an empty value in the Text column, which is not what I want.
How can I fix this?
Answer 0 (score: 1)
You can use mask together with difference for the preprocessing step, and then apply this solution:
import numpy as np

# Columns to blank out per type: everything except Type and the kept column
c1 = df.columns.difference(['Type','Name'])
c2 = df.columns.difference(['Type','Color'])

df[c1] = df[c1].mask(df['Type'] == 'System', np.nan)
df[c2] = df[c2].mask(df['Type'] == 'Hardware', np.nan)

# Remember the column order, because pop() removes 'Text' from df
cols = df.columns
df1 = (df.join(df.pop('Text').str.split('\n', expand=True)
                             .stack()
                             .reset_index(level=1, drop=True)
                             .rename('Text'))
         ).reset_index(drop=True).reindex(columns=cols)
print(df1)
Type Text Name ID Color
0 System NaN Gary NaN NaN
1 System NaN Mary NaN NaN
2 Hardware NaN NaN NaN White
3 Software ca Perry 56414.0 Green
4 Software maca Perry 56414.0 Green
5 Software stream Perry 56414.0 Green
6 Software phase Perry 56414.0 Green
7 Software Perry 56414.0 Green
8 Software aca Jimmy 548.0 Blue
9 Software mac Jimmy 548.0 Blue
10 Software stream Jimmy 548.0 Blue
11 Software phase Jimmy 548.0 Blue
12 Software Jimmy 548.0 Blue
13 System NaN Marc NaN NaN
14 System NaN John NaN NaN
15 Hardware NaN NaN NaN Gray
16 Hardware NaN NaN NaN Azure
17 System NaN Larry NaN NaN
18 Software aca James 6568.0 Magenta
19 Software maca James 6568.0 Magenta
20 Software st James 6568.0 Magenta
21 Software phase James 6568.0 Magenta
22 Software James 6568.0 Magenta
23 System NaN Kevin NaN NaN
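Note that the output above contains rows with an empty Text (rows 7, 12 and 22), because every Text value ends with a trailing `\n`. As an alternative sketch that is not part of the answer above, on pandas 0.25 or later `Series.explode` can do the splitting, and stripping the trailing newline first avoids those empty rows. The helper name `blank_and_explode` and the three-row `sample` frame are hypothetical and only for illustration.

```python
import pandas as pd


def blank_and_explode(df, text_col="Text"):
    """Hypothetical helper: blank cells by Type, then split text_col into rows."""
    out = df.copy()
    # For each type, the single column (besides Type) whose value is kept.
    keep = {"System": "Name", "Hardware": "Color"}
    for type_value, kept_col in keep.items():
        is_type = out["Type"] == type_value
        for col in out.columns.difference(["Type", kept_col]):
            # Series.mask replaces values with NaN where the condition is True.
            out[col] = out[col].mask(is_type)
    # Strip the trailing newline so no empty piece is produced, then split.
    out[text_col] = out[text_col].str.rstrip("\n").str.split("\n")
    # explode turns each list element into its own row; NaN cells stay as one row.
    return out.explode(text_col).reset_index(drop=True)


# Small illustrative frame, assuming "\n" is a real newline character.
sample = pd.DataFrame({
    "Type": ["System", "Hardware", "Software"],
    "Text": ["aca\nmaca\nstream\nphase\n",
             "a\nmaca\nstream\nphase\n",
             "ca\nmaca\nstream\nphase\n"],
    "Name": ["Gary", "Jerry", "Perry"],
    "ID": [123, 158, 56414],
    "Color": ["Red", "White", "Green"],
})
result = blank_and_explode(sample)
print(result)
```

As in the answer above, ID ends up as a float column because it now holds NaN; if that matters, converting it with `astype("Int64")` (pandas' nullable integer dtype) is one option.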