Percentage change in pandas

Asked: 2018-02-24 20:11:02

Tags: python-3.x pandas

My DataFrame looks similar to the following:

      Date         ACH   BABA   BIDU    CEA    CHA   CTRP    EDU    HNP  
0     2000-06-30  $1.00  $3.00  $1.00  $0.00  $0.00  $0.00  $0.00  $0.00   
1     2000-07-03  $3.00  $2.00  $6.20  $1.50  $0.00  $0.00  $0.00 $-0.48   
2     2000-07-04  $5.00  $6.00  $3.00  $0.00  $0.00  $0.00  $0.00  $0.00 

I am trying to calculate the percentage change for each column using the following:

df_vals = df[[ticker for ticker in tickers]].pct_change()

But I get the following error:

TypeError: unsupported operand type(s) for /: 'str' and 'str'

I assume I get this error because I have column headers, so the strings cannot be computed. I then tried adding shift (which may also be wrong):

df_vals = df[[ticker for ticker in tickers]].shift(1).pct_change()

This returns the same error. Thanks for your help.

1 Answer:

Answer 0: (score: 3)
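
The TypeError most likely comes from the cell values rather than the column headers: entries such as $1.00 are strings, and pct_change has to divide them. The sketch below illustrates this on a hypothetical one-ticker frame (the frame and the variable names is_text and shifted are assumptions made for illustration, not part of the question's code). It also shows why shift(1) cannot help: it only moves rows and leaves the remaining values as strings.

```python
import pandas as pd

# Hypothetical frame mimicking the question's data: prices stored as text.
df = pd.DataFrame({
    "Date": ["2000-06-30", "2000-07-03", "2000-07-04"],
    "ACH": ["$1.00", "$3.00", "$5.00"],
})

# Every price is a Python string, so it cannot be used in a division.
is_text = all(isinstance(value, str) for value in df["ACH"])
print(is_text)

# shift(1) moves the rows down by one; the surviving values are still strings.
shifted = df[["ACH"]].shift(1)
print(type(shifted["ACH"].iloc[1]))
```

Once the values are confirmed to be text, the remaining work is to strip the currency symbol and convert to a numeric dtype before calling pct_change.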

You need to use replace to remove the $ and convert to float first:

import io

import pandas as pd

s = '''\
Date        ACH   BABA   BIDU    CEA    CHA   CTRP    EDU    HNP
2000-06-30  $1.00  $3.00  $1.00  $0.00  $0.00  $0.00  $0.00  $0.00
2000-07-03  $3.00  $2.00  $6.20  $1.50  $0.00  $0.00  $0.00 $-0.48
2000-07-04  $5.00  $6.00  $3.00  $0.00  $0.00  $0.00  $0.00  $0.00'''

# Recreate sample dataframe (io.StringIO stands in for pd.compat.StringIO,
# which is no longer available in later pandas releases)
df = pd.read_csv(io.StringIO(s), sep=r'\s+')

# Move Date into the index (so it is excluded), strip every $, and cast to float
df = df.set_index('Date').replace(r'\$', '', regex=True).astype(float)

# Apply pct change and reset index
df = df.pct_change().reset_index()

print(df)

Returns:

         Date       ACH      BABA      BIDU       CEA  CHA  CTRP  EDU  \
0  2000-06-30       NaN       NaN       NaN       NaN  NaN   NaN  NaN   
1  2000-07-03  2.000000 -0.333333  5.200000       inf  NaN   NaN  NaN   
2  2000-07-04  0.666667  2.000000 -0.516129 -1.000000  NaN   NaN  NaN   

        HNP  
0       NaN  
1      -inf  
2 -1.000000
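
The inf, -inf and NaN entries in this output follow from how the percentage change is defined: each value is divided by the previous row's value and 1 is subtracted, so a previous value of 0 produces a division by zero. The sketch below works through this on a hypothetical three-column subset of the cleaned data (the names prices, manual, changes and cleaned are assumptions for illustration). Replacing infinities with NaN is shown as one possible convention, not something the original answer does.

```python
import numpy as np
import pandas as pd

# Hypothetical subset of the already-cleaned (float) data.
prices = pd.DataFrame({
    "ACH": [1.0, 3.0, 5.0],
    "CEA": [0.0, 1.5, 0.0],
    "CHA": [0.0, 0.0, 0.0],
})

# With no missing values, pct_change matches current / previous - 1.
manual = prices / prices.shift(1) - 1
changes = prices.pct_change()

# Optional: treat a change measured from zero as missing instead of infinite.
cleaned = changes.replace([np.inf, -np.inf], np.nan)
print(cleaned)
```

Whether an infinite change should be kept, dropped, or set to NaN depends on how the result is used downstream.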