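I am looking for an elegant, Pythonic way to make the columns of a Pandas DataFrame consistent. Meaning: every column in a master list should be present (missing ones added as empty placeholders), and the columns should appear in the same order as the master list.
I have the following example, which works, but is there a built-in Pandas method to achieve the same goal?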
import numpy as np
import pandas as pd

df1 = pd.DataFrame(data=[{'a': 1, 'b': 32, 'c': 32}])
print(df1)
   a   b   c
0  1  32  32

column_master_list = ['b', 'c', 'e', 'd', 'a']

def get_dataframe_with_consistent_header(df, headers):
    # Add any master-list column that is missing, filled with NaN
    # (note: this modifies df in place).
    for col in headers:
        if col not in df.columns:
            df[col] = np.nan
    # Select the columns in master-list order.
    return df[headers]

print(get_dataframe_with_consistent_header(df1, column_master_list))
    b   c   e   d  a
0  32  32 NaN NaN  1
Answer 0 (score: 4)
You can use the reindex_axis method. Pass in the list of column names and specify 'columns' as the axis. Missing entries are filled with NaN by default:
>>> df1.reindex_axis(column_master_list, 'columns')
    b   c   e   d  a
0  32  32 NaN NaN  1
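
For readers on a recent pandas release, where reindex_axis has been deprecated and later removed, the same result can most likely be obtained with DataFrame.reindex and its columns keyword. The following is a minimal sketch rather than part of the original answer; the names consistent and zero_filled are made up for illustration, and the fill_value variant is only shown as an option in case NaN is not the placeholder you want.

```python
import pandas as pd

df1 = pd.DataFrame(data=[{'a': 1, 'b': 32, 'c': 32}])
column_master_list = ['b', 'c', 'e', 'd', 'a']

# reindex returns a new DataFrame whose columns follow the master list;
# columns missing from df1 ('e' and 'd') are created and filled with NaN.
# df1 itself is left unchanged.
consistent = df1.reindex(columns=column_master_list)

# Optional: fill_value replaces the default NaN in the newly created columns.
zero_filled = df1.reindex(columns=column_master_list, fill_value=0)

print(consistent)
print(zero_filled)
```

Unlike the helper function in the question, this does not modify the original DataFrame, so assign the result if you want to keep it.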