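I need to create a categorical variable from a pandas DatetimeIndex, and I am looking for a Pythonic way to do it.
Until now I have just looped over all the index values and used a bunch of if-else statements. I tried building a dictionary of lambda if-else functions (following "Adding a new pandas column with mapped value from a dictionary") and passing it to map to create the categorical variable, but that did not work: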
date_series = pd.date_range(start = '2010-12-31', end = '2018-12-31', freq = 'M')
regime_splitter = {lambda x : x < '2012' : 'before 2012' , lambda x : x>= '2012' and x < '2014': '2012 - 2014', lambda x : x>= '2014' : 'after 2014'}
date_series.map(regime_splitter)
Expected result:
   date        regime
0  2010-12-31  before 2012
1  2013-05-31  between 2012, 2014
2  2018-12-31  after 2014
Answer 0 (score: 2)
Use cut together with DatetimeIndex.year; this also makes it easy to add or remove groups later:
import numpy as np

a = pd.cut(date_series.year,
           bins=[-np.inf, 2012, 2014, np.inf],
           labels=['before 2012', '2012 - 2014', 'after 2014'])
print (a.value_counts())
before 2012 25
2012 - 2014 24
after 2014 48
dtype: int64
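One detail worth noting about the counts above: pd.cut uses right-closed bins by default, so the year 2012 lands in 'before 2012' and 2014 lands in '2012 - 2014'. If the half-open boundaries from the question's lambdas are wanted instead (year < 2012, 2012 <= year < 2014, year >= 2014), passing right=False is one way to get them. A minimal sketch, where freq=pd.offsets.MonthEnd() is used as a stand-in for freq='M' (spelled 'ME' in newer pandas) and the variable names are only illustrative:

```python
import numpy as np
import pandas as pd

# Month-end dates, same range as in the question.
date_series = pd.date_range(start="2010-12-31", end="2018-12-31",
                            freq=pd.offsets.MonthEnd())

# right=False makes every bin closed on the left and open on the right:
# [-inf, 2012), [2012, 2014), [2014, inf)
regime = pd.cut(date_series.year,
                bins=[-np.inf, 2012, 2014, np.inf],
                labels=["before 2012", "2012 - 2014", "after 2014"],
                right=False)

counts = pd.Series(regime).value_counts()
print(counts)
```

Compared with the default, this moves the 2012 dates into the middle group and the 2014 dates into the last one.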
Another solution, using numpy.select:
x = date_series.year
a = np.select([x <= 2012, x>= 2014], ['before 2012','after 2014'], '2012 - 2014')
print (pd.Series(a).value_counts())
after 2014 60
before 2012 25
2012 - 2014 12
dtype: int64
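The snippets so far only print counts, while the question asks for a table with a date column and a regime column. A minimal sketch of how np.select could produce that table, assuming the half-open boundaries of the asker's lambdas (year < 2012, year >= 2014) rather than the <= 2012 used above; the column names follow the expected result, and freq=pd.offsets.MonthEnd() stands in for freq='M':

```python
import numpy as np
import pandas as pd

dates = pd.date_range(start="2010-12-31", end="2018-12-31",
                      freq=pd.offsets.MonthEnd())
years = dates.year

# Conditions are checked in order; anything matching neither gets the default.
conditions = [years < 2012, years >= 2014]
labels = ["before 2012", "after 2014"]
regime = np.select(conditions, labels, default="2012 - 2014")

df = pd.DataFrame({"date": dates, "regime": regime})

# Optional: store the labels as an ordered categorical instead of plain strings.
df["regime"] = pd.Categorical(
    df["regime"],
    categories=["before 2012", "2012 - 2014", "after 2014"],
    ordered=True,
)

# Rows for the three dates shown in the expected result.
print(df.iloc[[0, 29, 96]])
```

Making the column an ordered categorical is a design choice, not a requirement; it keeps the regimes in chronological order when sorting or grouping.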
You can also change your own solution to use a nested if-else, but it should be slower if the data is large:
regime_splitter = (lambda x: 'before 2012' if x <= 2012 else
                   ('2012 - 2014' if x >= 2012 and x <= 2014 else 'after 2014'))
a = date_series.year.map(regime_splitter)
print (a.value_counts())
after 2014 48
before 2012 25
2012 - 2014 24
dtype: int64
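The remark about speed can be checked directly. A rough sketch using timeit, assuming a synthetic daily index (not part of the original post) so that there is enough data to measure; both functions use the same boundaries as the nested if-else above, and the timings will vary with machine and library versions:

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic daily index, much longer than the monthly series in the question.
big = pd.date_range(start="1900-01-01", end="2018-12-31", freq="D")
years = big.year

def with_map():
    # Element-by-element Python function, as in the nested if-else version.
    return years.map(lambda y: "before 2012" if y <= 2012
                     else ("2012 - 2014" if y <= 2014 else "after 2014"))

def with_select():
    # Vectorized equivalent with the same boundaries.
    return np.select([years <= 2012, years > 2014],
                     ["before 2012", "after 2014"],
                     default="2012 - 2014")

# The two approaches must agree before their timings are comparable.
assert list(with_map()) == with_select().tolist()

print("map:   ", timeit.timeit(with_map, number=3))
print("select:", timeit.timeit(with_select, number=3))
```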
Answer 1 (score: 0)
import pandas as pd
data_series = pd.date_range(start='2010-12-31', end='2018-12-31', freq='M')
df = pd.DataFrame(data_series, columns=['Dates'])
def regime_splitter(value):
    # Compare each timestamp against the regime boundaries.
    if value < pd.to_datetime('2012-01-01'):
        return 'before 2012'
    elif value > pd.to_datetime('2014-12-31'):
        return 'After 2014'
    else:
        return 'Between 2012, 2014'
df['regime_splitter'] = df['Dates'].apply(regime_splitter)
df.head(15)
Dates regime_splitter
0 2010-12-31 before 2012
1 2011-01-31 before 2012
2 2011-02-28 before 2012
3 2011-03-31 before 2012
4 2011-04-30 before 2012
5 2011-05-31 before 2012
6 2011-06-30 before 2012
7 2011-07-31 before 2012
8 2011-08-31 before 2012
9 2011-09-30 before 2012
10 2011-10-31 before 2012
11 2011-11-30 before 2012
12 2011-12-31 before 2012
13 2012-01-31 Between 2012, 2014
14 2012-02-29 Between 2012, 2014