I have a DataFrame that looks like this:
Date Player Fee
0 2017-01-08 Steven Berghuis 6500000
1 2017-07-18 Jerry St. Juste 4500000
2 2017-07-18 Ridgeciano Haps 600000
3 2017-01-07 Sofyan Amrabat 400000
I want to replace each date value with a str if it meets a condition:
def is_in_range(x):
    ses1 = pd.to_datetime('2013-02-01')
    ses2 = pd.to_datetime('2014-02-01')
    ses3 = pd.to_datetime('2015-02-01')
    ses4 = pd.to_datetime('2016-02-01')
    ses5 = pd.to_datetime('2017-02-01')
    ses6 = pd.to_datetime('2018-02-01')
    if x < ses1 :
        x = '2012-13'
    if x > ses2 and x < ses3 :
        x = '2013-14'
    if x > ses3 and x < ses4 :
        x = '2014-15'
    if x > ses4 and x < ses5 :
        x = '2015-16'
    if x > ses5 and x < ses6 :
        x = '2016-17'
    return ses6

aj = ajax_t['Date'].apply(is_in_range)
aj
TypeError                                 Traceback (most recent call last)
in ()
     18         x = '2016-17'
     19     return ses6
---> 20 aj = ajax_t['Date'].apply(is_in_range)
     21 aj

/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   2353         else:
   2354             values = self.asobject
-> 2355             mapped = lib.map_infer(values, f, convert=convert_dtype)
   2356
   2357         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer (pandas/_libs/lib.c:66645)()

in is_in_range(x)
     15     if x > ses4 and x < ses5 :
     16         x = '2015-16'
---> 17     if x > ses5 and x < ses6 :
     18         x = '2016-17'
     19     return ses6

pandas/_libs/tslib.pyx in pandas._libs.tslib._Timestamp.__richcmp__ (pandas/_libs/tslib.c:20281)()

TypeError: Cannot compare type 'Timestamp' with type 'str'
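I get this error. Any suggestions, please?
Answer 0: (score: 1)
You need to convert the column with to_datetime if necessary, and assign the label to a different variable, y, instead of x. Inside the function x is overwritten with a string, so the next comparison against a Timestamp fails. The function should also return y: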
ajax_t['Date'] = pd.to_datetime(ajax_t['Date'])

def is_in_range(x):
    # Season boundaries
    ses1 = pd.to_datetime('2013-02-01')
    ses2 = pd.to_datetime('2014-02-01')
    ses3 = pd.to_datetime('2015-02-01')
    ses4 = pd.to_datetime('2016-02-01')
    ses5 = pd.to_datetime('2017-02-01')
    ses6 = pd.to_datetime('2018-02-01')
    y = None  # stays None if x falls outside every range below
    if x < ses1 :
        y = '2012-13'
    if x > ses2 and x < ses3 :
        y = '2013-14'
    if x > ses3 and x < ses4 :
        y = '2014-15'
    if x > ses4 and x < ses5 :
        y = '2015-16'
    if x > ses5 and x < ses6 :
        y = '2016-17'
    return y

aj = ajax_t['Date'].apply(is_in_range)
print (aj)
0 2015-16
1 2016-17
2 2016-17
3 2015-16
Name: Date, dtype: object
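To see why the original function fails, here is a minimal sketch that replays only the relevant steps for the first row of the sample data. The variable `error` is just an illustrative name, and the exact wording of the exception message may differ between pandas versions.

```python
import pandas as pd

ses4 = pd.to_datetime('2016-02-01')
ses5 = pd.to_datetime('2017-02-01')

x = pd.to_datetime('2017-01-08')  # first row of the sample data

# This branch matches, so x is rebound to a plain string.
if x > ses4 and x < ses5:
    x = '2015-16'

# The next check now compares a str with a Timestamp.
error = None
try:
    x > ses5
except TypeError as exc:
    error = exc

print(type(x).__name__, '->', type(error).__name__)
```

Keeping the label in a separate variable (or using elif so that only one branch runs) means every comparison still sees a Timestamp.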
Answer 1: (score: 1)
Use pd.cut
ses1 = pd.to_datetime('2013-02-01')
ses2 = pd.to_datetime('2014-02-01')
ses3 = pd.to_datetime('2015-02-01')
ses4 = pd.to_datetime('2016-02-01')
ses5 = pd.to_datetime('2017-02-01')
ses6 = pd.to_datetime('2018-02-01')
pd.cut(df.Date,[ses1,ses2,ses3,ses4,ses5,ses6],labels=['2012-13','2013-14','2014-15','2015-16','2016-17'])
Out[1227]:
0 2015-16
1 2016-17
2 2016-17
3 2015-16
Name: Date, dtype: category
Or
ses = pd.to_datetime(['2013-02-01','2014-02-01','2015-02-01','2016-02-01','2017-02-01','2018-02-01'])
pd.cut(df.Date,ses,labels=['2012-13','2013-14','2014-15','2015-16','2016-17'])
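For anyone who wants to run this end to end, here is a self-contained sketch. The Series `dates` is rebuilt from the sample rows in the question and the names `seasons`, `boundary` and `outside` are only illustrative. It also shows two details worth knowing: by default each bin includes its right edge (pass `right=False` to include the left edge instead), and values outside all bins come back as NaN.

```python
import pandas as pd

# Sample dates from the question
dates = pd.Series(
    pd.to_datetime(['2017-01-08', '2017-07-18', '2017-07-18', '2017-01-07']),
    name='Date',
)

# Six edges define five bins, one per label
ses = pd.to_datetime(['2013-02-01', '2014-02-01', '2015-02-01',
                      '2016-02-01', '2017-02-01', '2018-02-01'])
labels = ['2012-13', '2013-14', '2014-15', '2015-16', '2016-17']

seasons = pd.cut(dates, bins=ses, labels=labels)
print(seasons)

# A date sitting exactly on an edge: the default bins are right-closed
boundary = pd.Series(pd.to_datetime(['2017-02-01']))
closed_right = pd.cut(boundary, bins=ses, labels=labels)
closed_left = pd.cut(boundary, bins=ses, labels=labels, right=False)

# A date before the first edge falls in no bin and becomes NaN
outside = pd.cut(pd.Series(pd.to_datetime(['2012-06-01'])), bins=ses, labels=labels)
```

Unlike the apply version, this returns a categorical column; call `.astype(str)` on the result if plain strings are needed.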
Answer 2: (score: 0)
You can try specifying the date format explicitly (it has to match the string being parsed):
ses1 = pd.to_datetime('2017-01-08', format='%Y-%m-%d')
Answer 3: (score: 0)
Apparently, the Date column of the DataFrame ajax_t was not loaded as DateTime. Try converting it:
ajax_t['Date'] = pd.to_datetime(ajax_t.Date)
Alternatively, if you load the DataFrame ajax_t from a file, for example data.csv, you can pass a parameter that forces the Date column to be parsed as a DateTime type.
ajax_t = pd.read_csv('data.csv', parse_dates=['Date'])
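As a quick check of what parse_dates does, here is a small sketch that reads the sample rows from an in-memory string instead of a real data.csv. The comma-separated layout is an assumption, since the actual file is not shown.

```python
import io

import pandas as pd

# Assumed CSV layout, built from the sample rows in the question
csv_text = """Date,Player,Fee
2017-01-08,Steven Berghuis,6500000
2017-07-18,Jerry St. Juste,4500000
2017-07-18,Ridgeciano Haps,600000
2017-01-07,Sofyan Amrabat,400000
"""

ajax_t = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'])

# The Date column should now have a datetime dtype rather than object
print(ajax_t.dtypes)
```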
Hope this helps.