Setup
I have a pandas DataFrame df with several columns, whose headers look like this:
| id | x, single room | x, double room | y, single room | y, double room |
--------------------------------------------------------------------------
⋮ ⋮ ⋮ ⋮ ⋮
Question
I would like to group the columns that start with x and the columns that start with y under a common top-level header, like this:
| x | y |
--------------------------------------------------------------
| id | single room | double room | single room | double room |
--------------------------------------------------------------
⋮ ⋮ ⋮ ⋮ ⋮
How can I do this?
Answer 0 (score: 3)
You can use split, but the main difficulty is getting id into the last level:
import pandas as pd

col = ['id', 'x, single room', 'x, double room', 'y, single room', 'y, double room']
df = pd.DataFrame([[1, 1, 1, 1, 1]], columns=col)
print(df)
   id  x, single room  x, double room  y, single room  y, double room
0   1               1               1               1               1
# split each name on ', '; expand=True returns a MultiIndex, and .values gives its tuples
a = df.columns.str.split(', ', expand=True).values
print(a)
[('id', nan) ('x', 'single room') ('x', 'double room') ('y', 'single room')
 ('y', 'double room')]
# for names without a separator (NaN in the second position), move the name to the last level and use '' as the first level
df.columns = pd.MultiIndex.from_tuples([('', x[0]) if pd.isnull(x[1]) else x for x in a])
print(df)
               x                       y
  id single room double room single room double room
0  1           1           1           1           1
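The same idea can be wrapped in a small helper. This is only a sketch: the function name split_columns and its sep parameter are made up for illustration, and it uses Python's str.partition instead of the .str.split(..., expand=True) call above, so it splits on the first separator only and always produces exactly two levels.

```python
import pandas as pd


def split_columns(columns, sep=", "):
    """Build a two-level MultiIndex; names without `sep` go to the last level."""
    tuples = []
    for name in columns:
        head, found, tail = name.partition(sep)
        # `found` is empty when the separator is missing: use '' as the
        # top level and keep the whole name in the last level.
        tuples.append((head, tail) if found else ("", name))
    return pd.MultiIndex.from_tuples(tuples)


col = ["id", "x, single room", "x, double room", "y, single room", "y, double room"]
df = pd.DataFrame([[1, 1, 1, 1, 1]], columns=col)
df.columns = split_columns(df.columns)

print(df.columns.tolist())
```

The printed list should contain ('', 'id') followed by the four (x or y, room type) pairs, which matches the header layout asked for in the question.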
Old solution:
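# one row per column name, holding the split parts in columns 0 and 1
a = pd.DataFrame(df.columns.str.rsplit(', ', expand=True).values.tolist())
# names without a separator have NaN in column 1; swap the two parts for them
mask = a[1].isnull()
a.loc[mask, [0,1]] = a.loc[mask, [1,0]].values
# the swapped-in NaN becomes the empty top-level label
a[0] = a[0].fillna('')
df.columns = a.set_index([0,1]).index
# blank out the level names (0 and 1) left over from set_index
df.columns.names = ('', '')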
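Once the columns form a two-level MultiIndex (with either solution), each group can be selected by its top-level label. A minimal sketch follows; the row values 1 to 5 are made-up illustration data, not taken from the question, and the MultiIndex is built directly so the snippet runs on its own.

```python
import pandas as pd

# Same header layout as the result above, built directly from tuples.
columns = pd.MultiIndex.from_tuples([
    ("", "id"),
    ("x", "single room"), ("x", "double room"),
    ("y", "single room"), ("y", "double room"),
])
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=columns)  # illustrative values

x_block = df["x"]                    # DataFrame with all columns under x
ids = df[("", "id")]                 # Series: id sits under the '' label
y_double = df[("y", "double room")]  # Series: one specific column

print(list(x_block.columns))
print(ids.iloc[0], y_double.iloc[0])
```

Selecting with a single top-level label drops that level, so x_block has plain single-level column names.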