In [20]: df.head()
Out[20]:
   year  month      capital       sales       income       profit         debt
0  2000      6  -19250379.0  37924704.0   -4348337.0    2571738.0  192842551.0
1  2000     12  -68357153.0  27870187.0  -49074146.0  -20764204.0  190848380.0
2  2001      6  -65048960.0  30529435.0   -1172803.0    2000427.0  197383572.0
3  2001     12  -90129943.0  17135480.0  -24208501.0   -1012230.0  191464941.0
4  2002      6   14671980.0  31377347.0    2188125.0    3660938.0  101355088.0
I want to combine the year and month columns into a single date column. This is what I tried:
df['date'] = pd.to_datetime(df['year']*10000 + df['month']*100, format="%Y%m")
But it raised an error:
In [21]: df['date'] = pd.to_datetime(df['year']*10000 + df['month']*100, format="%Y%m"
...: )
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-21-31bfca8c5941> in <module>()
----> 1 df['date'] = pd.to_datetime(df['year']*10000 + df['month']*100, format="%Y%m")
~/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin)
507 elif isinstance(arg, ABCSeries):
508 from pandas import Series
--> 509 values = _convert_listlike(arg._values, False, format)
510 result = Series(values, index=arg.index, name=arg.name)
511 elif isinstance(arg, (ABCDataFrame, MutableMapping)):
~/.pyenv/versions/3.6.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py in _convert_listlike(arg, box, format, name, tz)
412 try:
413 result = tslib.array_strptime(arg, format, exact=exact,
--> 414 errors=errors)
415 except tslib.OutOfBoundsDatetime:
416 if errors == 'raise':
pandas/_libs/tslib.pyx in pandas._libs.tslib.array_strptime (pandas/_libs/tslib.c:63753)()
TypeError: 'int' object is unsliceable
I think this is because the day is missing.
How should I do this?
Answer 0 (score: 1)
In [223]: df['date'] = pd.to_datetime(df[['year','month']].assign(day=1)).dt.to_period('M')
In [224]: df
Out[224]:
   year  month      capital       sales       income       profit         debt     date
0  2000      6  -19250379.0  37924704.0   -4348337.0    2571738.0  192842551.0  2000-06
1  2000     12  -68357153.0  27870187.0  -49074146.0  -20764204.0  190848380.0  2000-12
2  2001      6  -65048960.0  30529435.0   -1172803.0    2000427.0  197383572.0  2001-06
3  2001     12  -90129943.0  17135480.0  -24208501.0   -1012230.0  191464941.0  2001-12
4  2002      6   14671980.0  31377347.0    2188125.0    3660938.0  101355088.0  2002-06
Or:
In [208]: df['date'] = pd.PeriodIndex(pd.to_datetime(df[['year','month']].assign(day=1)),
freq='M')
In [209]: df
Out[209]:
   year  month      capital       sales       income       profit         debt     date
0  2000      6  -19250379.0  37924704.0   -4348337.0    2571738.0  192842551.0  2000-06
1  2000     12  -68357153.0  27870187.0  -49074146.0  -20764204.0  190848380.0  2000-12
2  2001      6  -65048960.0  30529435.0   -1172803.0    2000427.0  197383572.0  2001-06
3  2001     12  -90129943.0  17135480.0  -24208501.0   -1012230.0  191464941.0  2001-12
4  2002      6   14671980.0  31377347.0    2188125.0    3660938.0  101355088.0  2002-06
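For anyone who wants to try this outside the session above, here is a minimal self-contained sketch. The three-row frame is a stand-in with only the year and month columns, and the names `stamps`, `as_text`, `stamps_from_text` and `periods` are made up for illustration. It also includes a string-based variant that is not part of the answer: the question's attempt appears to fail because `format=` parsing expects strings rather than integers, and `year*10000 + month*100` produces eight digits, which would not match `%Y%m` in any case.

```python
import pandas as pd

# Stand-in data: only the two columns that matter here.
df = pd.DataFrame({"year": [2000, 2000, 2001], "month": [6, 12, 6]})

# to_datetime can assemble dates from a frame that has
# year / month / day columns, so add a constant day=1.
stamps = pd.to_datetime(df[["year", "month"]].assign(day=1))

# Alternative (an assumption, not from the answer): build "YYYYMM"
# strings first, zero-padding the month, then parse with a format.
as_text = df["year"].astype(str) + df["month"].astype(str).str.zfill(2)
stamps_from_text = pd.to_datetime(as_text, format="%Y%m")

# Convert to monthly periods if only year-month granularity is needed.
periods = stamps.dt.to_period("M")
df["date"] = periods
```

If a real timestamp column (first day of each month) is preferable to a period, `stamps` can be assigned to `df['date']` directly and the `to_period` step skipped.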