Question

I am trying to work with a dataset (this one), but I can't get the DataFrame multi-indexed correctly. At the moment I have this:

import numpy as np
import pandas as pd

csv_path = 'https://ckan0.cf.opendata.inter.prod-toronto.ca/download_resource/ef0239b1-832b-4d0b-a1f3-4153e53b189e?format=csv'
# Use Topic and Characteristic as a two-level index.
df_profile = pd.read_csv(csv_path, index_col=["Topic", "Characteristic"])
#df_profile["Characteristic"]=df_profile.Characteristic.str.strip()
df_profile.drop(["_id", "Data Source", "Category"], axis=1, inplace=True)
# Replace the placeholder with a real missing value, not the string "NaN".
df_profile.replace("No Designation", np.nan, inplace=True)
df_profile.head(10)

Result of the code

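The problem is that some of the index levels do not have a column of their own; the hierarchy is expressed by the number of leading spaces instead. That is, a top-level entry in the column has no leading spaces, each subdivision beneath it is indented by additional spaces, and so on.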

Here is an excerpt of what df_profile.index.values.tolist() returns:

('Mother tongue',
  'Mother tongue for the total population excluding institutional residents'),
 ('Mother tongue', '  Single responses'),
 ('Mother tongue', '    Official languages'),
 ('Mother tongue', '      English'),
 ('Mother tongue', '      French'),
 ('Mother tongue', '    Non-official languages'),
 ('Mother tongue', '      Aboriginal languages'),
 ('Mother tongue', '        Algonquian languages'),
 ('Mother tongue', '          Blackfoot'),
 ('Mother tongue', '          Cree-Montagnais languages'),
 ('Mother tongue', '          Eastern Algonquian languages'),
 ('Mother tongue', '          Ojibway-Potawatomi languages'),
 ('Mother tongue', '          Algonquian languages, n.i.e.'),
 ('Mother tongue', '        Athabaskan languages'),

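Is there a function that converts this kind of structure into a DataFrame MultiIndex based on the number of leading spaces? If not, what is the simplest way to do it? Should I build a list of arrays, one per indentation depth (no spaces, one step, two steps...), and add them to the index?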

Thanks for your help.
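To make the idea concrete, here is a minimal sketch of the "one array per indentation depth" approach. As far as I know pandas has no built-in for this, so the helpers `indent_to_paths` and `paths_to_multiindex` are hypothetical names, not pandas API. The sketch assumes two spaces per level (as in the excerpt above), that each top-level entry has no indentation, and it uses a few labels copied from the excerpt rather than the real CSV.

```python
import pandas as pd


def indent_to_paths(labels, indent=2):
    """Turn indentation-encoded labels into tuples of ancestor labels.

    Assumes `indent` spaces per hierarchy level (an assumption based on
    the excerpt, not something the dataset documents).
    """
    paths, stack = [], []
    for raw in labels:
        depth = (len(raw) - len(raw.lstrip(" "))) // indent
        # Drop entries at this depth or deeper, then push the current label.
        del stack[depth:]
        stack.append(raw.strip())
        paths.append(tuple(stack))
    return paths


def paths_to_multiindex(topics, paths, fill=""):
    """Pad every path to the same width and build a MultiIndex."""
    width = max(len(p) for p in paths)
    tuples = [
        (topic,) + path + (fill,) * (width - len(path))
        for topic, path in zip(topics, paths)
    ]
    names = ["Topic"] + [f"Level {i}" for i in range(width)]
    return pd.MultiIndex.from_tuples(tuples, names=names)


# Sample labels taken from the excerpt in the question.
root = "Mother tongue for the total population excluding institutional residents"
labels = [
    root,
    "  Single responses",
    "    Official languages",
    "      English",
    "      French",
    "    Non-official languages",
    "      Aboriginal languages",
    "        Algonquian languages",
    "          Blackfoot",
]
topics = ["Mother tongue"] * len(labels)

paths = indent_to_paths(labels)
index = paths_to_multiindex(topics, paths)
print(index.to_frame(index=False))
```

On the real frame, the same helpers could presumably be fed with df_profile.index.get_level_values("Topic") and df_profile.index.get_level_values("Characteristic"), and the result assigned back to df_profile.index. Shallower rows are padded with empty strings so that all tuples have the same length, which MultiIndex.from_tuples needs in order to produce a consistent number of levels.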

Create a MultiIndex based on the number of spaces

0 answers: