I have a DataFrame that I have pivoted:
FinancialYear 2014/2015 2015/2016 2016/2017 2017/2018
Month
April 42 32 29 27
August 34 28 32 0
December 45 51 28 0
February 28 20 28 0
January 32 28 33 0
July 40 66 31 30
June 32 67 37 35
March 43 36 39 0
May 34 30 24 29
November 39 32 31 0
October 38 39 28 0
September 29 19 34 0
This is the code I used:
new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]
hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')
df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))
The months are not in the order I want, so I used the following code to reindex against a list:
vals = ['April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December', 'January', 'February', 'March']
df_hm = df_hm.reindex(vals)
This puts the months in the right order, but most of the values in my table are now NaN:
FinancialYear 2014/2015 2015/2016 2016/2017 2017/2018
Month
April nan nan nan nan
May nan nan nan nan
June nan nan nan nan
July nan nan nan nan
August nan nan nan nan
September 29 19 34 0
October nan nan nan nan
November nan nan nan nan
December nan nan nan nan
January nan nan nan nan
February nan nan nan nan
March nan nan nan nan
Any idea what is happening? How can I fix it? Is there a better alternative?
Answer 0 (score: 4)
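Unexpected NaNs after reindexing usually mean that the new index labels do not exactly match the old ones. For example, if the original index labels contain trailing whitespace but the new labels do not, you get NaNs: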
import pandas as pd
# Index labels carry a trailing space, which printed output does not reveal
df = pd.DataFrame({'col': [1, 2, 3]}, index=['April ', 'June ', 'May '])
print(df)
# col
# April 1
# June 2
# May 3
df2 = df.reindex(['April', 'May', 'June'])
print(df2)
# col
# April NaN
# May NaN
# June NaN
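Because printed output hides trailing whitespace, it can help to inspect the raw labels before reindexing. The following is a minimal sketch that rebuilds the same example frame; the names `wanted` and `missing` are illustrative and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({'col': [1, 2, 3]}, index=['April ', 'June ', 'May '])
wanted = ['April', 'May', 'June']

# tolist() shows the quoted strings, so stray whitespace becomes visible
print(df.index.tolist())
# ['April ', 'June ', 'May ']

# Requested labels that do not exist in the current index;
# each of these becomes an all-NaN row after reindex
missing = [label for label in wanted if label not in df.index]
print(missing)
# ['April', 'May', 'June']
```

If `missing` is non-empty, those rows come back as NaN, so the mismatch is worth fixing before calling `reindex`.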
This can be fixed by stripping the whitespace so that the labels match:
df.index = df.index.str.strip()
df3 = df.reindex(['April', 'May', 'June'])
print(df3)
# col
# April 1
# May 3
# June 2
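Applied to the question, the same idea might look like the sketch below. In the question only September survives the reindex, which would be consistent with month names padded to a fixed width of nine characters (September is the longest month name), but that is an assumption. The `hmdf` built here is a hypothetical stand-in for the real data, with padded month names created on purpose.

```python
import pandas as pd

# Hypothetical stand-in for hmdf: month names padded to 9 characters (assumed)
months = ['April', 'May', 'September', 'April']
hmdf = pd.DataFrame({
    'FinancialYear': ['2014/2015', '2014/2015', '2014/2015', '2015/2016'],
    'Month': [m.ljust(9) for m in months],
})

vals = ['April', 'May', 'June', 'July', 'August', 'September',
        'October', 'November', 'December', 'January', 'February', 'March']

# Strip whitespace at the source so every later step sees clean labels
hmdf['Month'] = hmdf['Month'].str.strip()

df_hm = (hmdf.groupby(['Month', 'FinancialYear'])
             .size()
             .unstack(fill_value=0)
             # fill_value=0 gives months with no rows a count of 0, not NaN
             .reindex(index=vals, fill_value=0))
print(df_hm)
```

Stripping the column before grouping, rather than stripping the index afterwards, keeps the clean labels available to any other step that uses `hmdf['Month']`.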