I am trying to use the DatetimeIndex of a pandas DataFrame to assign a new column called "season".
from datetime import datetime
import pandas as pd

# Month numbers belonging to each season
winter = [12, 1, 2]
spring = [3, 4, 5]
summer = [6, 7, 8]
autumn = [9, 10, 11]
DTX_index = [datetime(2017, 2, 1).date(), datetime(2017, 3, 1).date(), datetime(2017, 6, 1).date(), datetime(2017, 9, 1).date()]
DTX_index = pd.to_datetime(DTX_index, utc=True)
df = pd.DataFrame(index=DTX_index)
I would like to end up with something like this:
season
2017-02-01 00:00:00+00:00 winter
2017-03-01 00:00:00+00:00 spring
2017-06-01 00:00:00+00:00 summer
2017-09-01 00:00:00+00:00 autumn
Assigning the month works:
df['month'] = df.index.month
So does assigning a boolean for a single season:
df['season'] = df.index.month.isin([12,1,2])
What I can't work out is how to assign the season from the month across the whole df. I tried a function with apply:
def add_season(x):
    if x.index.month.isin([12,1,2]):
        return 'winter'
    elif x.index.month.isin([3,4,5]):
        return 'spring'
    elif x.index.month.isin([6,7,8]):
        return 'summer'
    elif x.index.month.isin([9,10,11]):
        return 'autumn'
df['season'] = df.apply(add_season)
But this raises a ValueError:
ValueError: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', 'occurred at index season')
Presumably this is because the function operates on a whole Series rather than on individual elements.
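A minimal sketch of that suspicion, assuming a recent pandas version: with its default axis, `DataFrame.apply` hands each column to the function as a Series, so `isin` returns one boolean per row, and an `if` cannot reduce several booleans to a single truth value. The names `record`, `seen`, `mask` and `ambiguous` are only for illustration.

```python
import pandas as pd

idx = pd.to_datetime(["2017-02-01", "2017-03-01", "2017-06-01"], utc=True)
df = pd.DataFrame(index=idx)
df["month"] = df.index.month

seen = []

def record(col):
    # Record what kind of object apply passes in (one call per column).
    seen.append(type(col).__name__)
    return len(col)

df.apply(record)

# isin on the whole index yields one boolean per row ...
mask = df.index.month.isin([12, 1, 2])

# ... and converting a multi-element boolean array to a single bool fails.
try:
    bool(mask)
    ambiguous = False
except ValueError:
    ambiguous = True

print(seen, len(mask), ambiguous)
```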
I'm sure someone with more experience using apply can solve this quickly?
Many thanks
Answer 0 (score: 3)
IIUC, you can build a month-to-season dict from your lists and map the index months through it:
d = {**dict.fromkeys(winter, 'winter'), **dict.fromkeys(spring, 'spring'),
     **dict.fromkeys(summer, 'summer'), **dict.fromkeys(autumn, 'autumn')}
df['Value'] = list(map(d.get, df.index.month))
df
Out[697]:
Value
2017-02-01 00:00:00+00:00 winter
2017-03-01 00:00:00+00:00 spring
2017-06-01 00:00:00+00:00 summer
2017-09-01 00:00:00+00:00 autumn
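For comparison, here is a self-contained sketch of the same dictionary idea, written with a dict comprehension and `Index.map` instead of `dict.fromkeys` and the built-in `map`. The names `seasons` and `month_to_season` are illustrative and not part of the answer above; the season lists are copied from the question.

```python
import pandas as pd

winter, spring, summer, autumn = [12, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]

# Invert season -> months into a month -> season lookup.
seasons = {"winter": winter, "spring": spring, "summer": summer, "autumn": autumn}
month_to_season = {m: name for name, months in seasons.items() for m in months}

idx = pd.to_datetime(["2017-02-01", "2017-03-01", "2017-06-01", "2017-09-01"], utc=True)
df = pd.DataFrame(index=idx)

# Index.map accepts a dict; a month missing from the dict would map to NaN.
df["season"] = df.index.month.map(month_to_season)
print(df)
```

Because the lookup is done per month value, no row-wise `apply` is needed.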
Answer 1 (score: 2)
You can create a mapping frame and use map. For this to work, each month must belong to exactly one season.
u = pd.DataFrame().assign(
    winter=winter, spring=spring, summer=summer, autumn=autumn
).melt().set_index('value')
df.assign(month=df.index.month.map(u.variable))
month
2017-02-01 00:00:00+00:00 winter
2017-03-01 00:00:00+00:00 spring
2017-06-01 00:00:00+00:00 summer
2017-09-01 00:00:00+00:00 autumn
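The requirement that the seasons contain distinct months can be checked before mapping. The following is a sketch under that assumption, not part of the answer itself: it builds the same melted lookup from a dict of lists, asserts that every month occurs only once via `Index.is_unique`, and stores the result in a column named `season` (the answer calls it `month`).

```python
import pandas as pd

winter, spring, summer, autumn = [12, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]

# Wide frame (one column per season) melted to long form:
# 'variable' holds the season name, 'value' holds the month number.
u = pd.DataFrame(
    {"winter": winter, "spring": spring, "summer": summer, "autumn": autumn}
).melt().set_index("value")

# A month listed under two seasons would make the lookup ambiguous.
if not u.index.is_unique:
    raise ValueError("each month must belong to exactly one season")

idx = pd.to_datetime(["2017-02-01", "2017-03-01", "2017-06-01", "2017-09-01"], utc=True)
df = pd.DataFrame(index=idx)

# Map each index month through the month-indexed Series of season names.
result = df.assign(season=df.index.month.map(u["variable"]))
print(result)
```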