I have a DataFrame df containing the following information:
DateTime MDate Fwd Type
1/4/2010 2/1/2010 61.17 A
1/5/2010 2/1/2010 59.73 A
1/6/2010 2/1/2010 62.2 A
1/7/2010 2/1/2010 61.1 A
1/8/2010 2/1/2010 60.25 A
1/11/2010 2/1/2010 57.12 A
1/12/2010 2/1/2010 57.35 A
1/13/2010 2/1/2010 58.12 B
1/14/2010 2/1/2010 57.12 B
1/15/2010 2/1/2010 59.38 B
8/1/2013 5/1/2014 57.67 B
8/2/2013 5/1/2014 57.25 B
8/3/2013 5/1/2014 57.9 B
8/4/2013 5/1/2014 59.25 B
8/5/2013 5/1/2014 57.67 B
I want to create the following:
DateTime MDate Fwd Type pctChange
1/4/2010 2/1/2010 61.17 A
1/5/2010 2/1/2010 59.73 A (0.02)
1/6/2010 2/1/2010 62.2 A 0.04
1/7/2010 2/1/2010 61.1 A (0.02)
1/8/2010 2/1/2010 60.25 A (0.01)
1/11/2010 2/1/2010 57.12 A (0.05)
1/12/2010 2/1/2010 57.35 A 0.00
1/13/2010 2/1/2010 58.12 B
1/14/2010 2/1/2010 57.12 B (0.02)
1/15/2010 2/1/2010 59.38 B 0.04
8/1/2013 5/1/2014 57.67 B
8/2/2013 5/1/2014 57.25 B (0.01)
8/3/2013 5/1/2014 57.9 B 0.01
8/4/2013 5/1/2014 59.25 B 0.02
8/5/2013 5/1/2014 57.67 B (0.03)
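I want to isolate the time series by (MDate, Type) group and calculate pctChange within each group. In the example above, the first group is built as follows. All of its rows share the same MDate and Type: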
DateTime MDate Fwd Type pctChange
1/4/2010 2/1/2010 61.17 A
1/5/2010 2/1/2010 59.73 A (0.02)
1/6/2010 2/1/2010 62.2 A 0.04
1/7/2010 2/1/2010 61.1 A (0.02)
1/8/2010 2/1/2010 60.25 A (0.01)
1/11/2010 2/1/2010 57.12 A (0.05)
1/12/2010 2/1/2010 57.35 A 0.00
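I calculate pctChange as the current row's Fwd divided by the previous row's Fwd in the same group, minus 1, with negative values shown in parentheses. For the second row: 59.73/61.17 - 1 = (0.02)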
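As a quick check of that arithmetic, here is a minimal plain-Python sketch over the first group's Fwd values. The names fwd and pct_change are illustrative only and are not part of df.

```python
# Fwd values for the first (MDate, Type) group, in DateTime order.
fwd = [61.17, 59.73, 62.2, 61.1, 60.25, 57.12, 57.35]

# Each change compares a value with the one before it;
# the first row has no predecessor, so it gets None.
pct_change = [None] + [curr / prev - 1 for prev, curr in zip(fwd, fwd[1:])]

print([None if p is None else round(p, 2) for p in pct_change])
# [None, -0.02, 0.04, -0.02, -0.01, -0.05, 0.0]
```

The negative results correspond to the parenthesized entries in the table above.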
I am considering an implementation along these lines:
import pandas as pd
df2 = pd.pivot_table(df, index=['MDate', 'Type'], values=['Fwd'], aggfunc=someFunction)
I can't work out what to use for someFunction.
Answer 0 (score: 1)
This should do it:
# Parse the date columns into datetime64 values.
df[['MDate', 'DateTime']] = df[['MDate', 'DateTime']].apply(pd.to_datetime)
# Per-group percent change; format NaN as blank and negatives in parentheses.
pct = df.groupby(['MDate', 'Type'])['Fwd'].pct_change()
df['pctChange'] = pct.apply(lambda x: '' if pd.isna(x) else '({0:.2f})'.format(-x) if x < 0 else '{0:.2f}'.format(x))
df
# DateTime Fwd MDate Type pctChange
#0 2010-01-04 61.17 2010-02-01 A
#1 2010-01-05 59.73 2010-02-01 A (0.02)
#2 2010-01-06 62.20 2010-02-01 A 0.04
#3 2010-01-07 61.10 2010-02-01 A (0.02)
#4 2010-01-08 60.25 2010-02-01 A (0.01)
#5 2010-01-11 57.12 2010-02-01 A (0.05)
#6 2010-01-12 57.35 2010-02-01 A 0.00
#7 2010-01-13 58.12 2010-02-01 B
#8 2010-01-14 57.12 2010-02-01 B (0.02)
#9 2010-01-15 59.38 2010-02-01 B 0.04
#10 2013-08-01 57.67 2014-05-01 B
#11 2013-08-02 57.25 2014-05-01 B (0.01)
#12 2013-08-03 57.90 2014-05-01 B 0.01
#13 2013-08-04 59.25 2014-05-01 B 0.02
#14 2013-08-05 57.67 2014-05-01 B (0.03)
The first statement converts MDate and DateTime to datetime, because I wasn't sure they were already in the right format.
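If the numeric values are needed later, one option is to keep pctChange as floats and put the accounting-style text in a separate column. The following is a self-contained sketch under some assumptions: it uses only a six-row subset of the question's data, sorts by DateTime in case the rows are not already in chronological order, and introduces a hypothetical helper accounting and a hypothetical column pctChangeText that do not appear in the original post.

```python
import pandas as pd

# A subset of the question's rows: two (MDate, Type) groups.
df = pd.DataFrame({
    'DateTime': ['1/4/2010', '1/5/2010', '1/6/2010',
                 '1/13/2010', '1/14/2010', '1/15/2010'],
    'MDate': ['2/1/2010'] * 6,
    'Fwd': [61.17, 59.73, 62.2, 58.12, 57.12, 59.38],
    'Type': ['A', 'A', 'A', 'B', 'B', 'B'],
})

# Parse the dates and sort so each group's rows are chronological.
df['DateTime'] = pd.to_datetime(df['DateTime'], format='%m/%d/%Y')
df = df.sort_values(['MDate', 'Type', 'DateTime']).reset_index(drop=True)

# Numeric percent change within each group; NaN marks a group's first row.
df['pctChange'] = df.groupby(['MDate', 'Type'])['Fwd'].pct_change()


def accounting(x):
    """Hypothetical display helper: blank for NaN, parentheses for negatives."""
    if pd.isna(x):
        return ''
    return '({:.2f})'.format(-x) if x < 0 else '{:.2f}'.format(x)


# Keep the floats in pctChange and store the formatted text separately.
df['pctChangeText'] = df['pctChange'].map(accounting)
print(df)
```

Keeping the float column means later steps such as filtering or averaging can still use the values, while the text column reproduces the layout requested in the question.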