My DataFrame looks like this:
Date ACH BABA BIDU CEA CHA CTRP EDU HNP
0 2000-06-30 $1.00 $3.00 $1.00 $0.00 $0.00 $0.00 $0.00 $0.00
1 2000-07-03 $3.00 $2.00 $6.20 $1.50 $0.00 $0.00 $0.00 $-0.48
2 2000-07-04 $5.00 $6.00 $3.00 $0.00 $0.00 $0.00 $0.00 $0.00
I am trying to calculate the percentage change for each ticker using the following:
df_vals = df[[ticker for ticker in tickers]].pct_change()
But I get the following error:
TypeError: unsupported operand type(s) for /: 'str' and 'str'
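I assume I am getting this error because of the column headers, which are strings and cannot be used in a calculation. I then tried adding shift (which is probably also wrong):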
df_vals = df[[ticker for ticker in tickers]].shift(1).pct_change()
This returns the same error. Thanks for your help.
Answer 0 (score: 3)
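The columns hold strings such as '$1.00', so pct_change cannot divide them. You need to use replace to remove the $ and convert to float first: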
import io
import pandas as pd
s = '''\
Date ACH BABA BIDU CEA CHA CTRP EDU HNP
2000-06-30 $1.00 $3.00 $1.00 $0.00 $0.00 $0.00 $0.00 $0.00
2000-07-03 $3.00 $2.00 $6.20 $1.50 $0.00 $0.00 $0.00 $-0.48
2000-07-04 $5.00 $6.00 $3.00 $0.00 $0.00 $0.00 $0.00 $0.00'''
# Recreate the sample DataFrame (pd.compat.StringIO was removed from pandas; use io.StringIO)
df = pd.read_csv(io.StringIO(s), sep=r'\s+')
# Set Date as the index (to exclude it from the calculation) and remove every $
df = df.set_index('Date').replace(r'\$', '', regex=True).astype(float)
# Apply pct_change and turn Date back into a regular column
df = df.pct_change().reset_index()
print(df)
Returns:
Date ACH BABA BIDU CEA CHA CTRP EDU \
0 2000-06-30 NaN NaN NaN NaN NaN NaN NaN
1 2000-07-03 2.000000 -0.333333 5.200000 inf NaN NaN NaN
2 2000-07-04 0.666667 2.000000 -0.516129 -1.000000 NaN NaN NaN
HNP
0 NaN
1 -inf
2 -1.000000
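
The inf and -inf entries in the output come from dividing by a previous price of 0.00. The following is a minimal sketch of a variant, not part of the original answer, that keeps Date as a column, converts only the ticker columns, and masks those infinite values. The `tickers` list is an assumption here (the question does not show how it was built), and replacing inf with NaN is one possible choice rather than the required one.

```python
import io

import numpy as np
import pandas as pd

raw = """\
Date ACH BABA BIDU CEA CHA CTRP EDU HNP
2000-06-30 $1.00 $3.00 $1.00 $0.00 $0.00 $0.00 $0.00 $0.00
2000-07-03 $3.00 $2.00 $6.20 $1.50 $0.00 $0.00 $0.00 $-0.48
2000-07-04 $5.00 $6.00 $3.00 $0.00 $0.00 $0.00 $0.00 $0.00"""

df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Assumption: the ticker list is every column except Date.
tickers = [col for col in df.columns if col != "Date"]

# Every price is still a string like '$1.00', which is why
# pct_change() raised the TypeError in the question.
all_strings = all(isinstance(v, str) for v in df[tickers].to_numpy().ravel())

# Strip the literal dollar sign column by column, then convert to float.
prices = df[tickers].apply(
    lambda col: col.str.replace("$", "", regex=False).astype(float)
)

# A move away from a price of 0.00 yields inf or -inf; mask those as NaN.
changes = prices.pct_change().replace([np.inf, -np.inf], np.nan)

# Put the Date column back in front of the computed changes.
result = pd.concat([df["Date"], changes], axis=1)
print(result)
```

Whether NaN is the right stand-in for a change from zero depends on how the result is used downstream; leaving the infinite values in place is equally valid if they carry meaning for the analysis.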