Python: Replace part of a pandas DataFrame with data from another DataFrame

Time: 2021-01-07 18:44:22

Tags: python pandas dataframe

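A particular stock has changed its ticker symbol. In my main DataFrame (df, ticker CAP.DE) I want to use the new data from another DataFrame (df_enc, ticker CAP.VI) in place of the incorrect data.

I have already cleaned the main DataFrame (df). This is my code: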

import pandas as pd
import numpy as np
from pandas_datareader import data as web
from datetime import datetime

# Get the stock symbols / tickers in the portfolio
# Assign the weights to the stocks

portfolio_value = 100 
stock_symbols = ['AAPL','GOOG','CAP.DE'] 
portfolio_weights = np.array([40,40,20])
portfolio_weights = (1/portfolio_value)*portfolio_weights

# Get the stock/portfolio starting date
stockStartDate = '2011-01-01'

# Get the stocks ending day (today)
today = datetime.today().strftime('%Y-%m-%d')

# Store the adjusted close price of the stocks in a DataFrame
data_source = 'yahoo'
df = web.DataReader(name=stock_symbols, data_source=data_source, start=stockStartDate, end=today)['Adj Close']

# Clear false data from CAP.DE
df.loc['2021-01-04':, 'CAP.DE'] = np.nan

# Download the replacement data under the new ticker (CAP.VI)
df_enc = web.DataReader(name='CAP.VI', data_source=data_source, start='2021-01-04', end=today)['Adj Close']

The result of my code is:

Symbols           AAPL         GOOG    CAP.DE
Date                                         
2011-01-03   10.153708   301.046600  1.669983
2011-01-04   10.206702   299.935760  1.669983
2011-01-05   10.290195   303.397797  1.545708
2011-01-06   10.281874   305.604523  1.561246
2011-01-07   10.355506   307.069031  1.561246
               ...          ...       ...
2020-12-31  132.690002  1751.880005       NaN
2021-01-04  129.410004  1728.239990       NaN
2021-01-05  131.009995  1740.920044       NaN
2021-01-06  126.599998  1735.290039       NaN
2021-01-07  130.470001  1779.035034       NaN

The result (df) should look like this:

Symbols           AAPL         GOOG    CAP.DE
Date                                         
2011-01-03   10.153708   301.046600  1.669983
2011-01-04   10.206702   299.935760  1.669983
2011-01-05   10.290195   303.397797  1.545708
2011-01-06   10.281874   305.604523  1.561246
2011-01-07   10.355506   307.069031  1.561246
               ...          ...       ...
2020-12-31  132.690002  1751.880005       NaN
2021-01-04  129.410004  1728.239990   21.049999
2021-01-05  131.009995  1740.920044   21.049999
2021-01-06  126.599998  1735.290039   22.549999
2021-01-07  130.470001  1779.035034   24.299999

1 Answer:

Answer 0: (Score: 0)

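This is one way to replace the null values with the other data. It assumes df_enc is the date-indexed 'Adj Close' Series from the question, so it has no 'CAP.VI' column to select. fillna aligns on the index, so only missing dates that also exist in df_enc are filled, and df itself is not overwritten:

df['CAP.DE'] = df['CAP.DE'].fillna(df_enc)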
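
To check how filling from a second, date-indexed Series behaves, here is a minimal sketch on made-up numbers. The names df_demo and enc_demo and all values are hypothetical stand-ins for df and df_enc, not the real Yahoo data.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values) standing in for df and df_enc
dates = pd.to_datetime(['2020-12-30', '2020-12-31', '2021-01-04', '2021-01-05'])
df_demo = pd.DataFrame(
    {'AAPL': [133.0, 132.0, 129.0, 131.0],
     'CAP.DE': [1.5, np.nan, np.nan, np.nan]},
    index=dates,
)

# The replacement series only covers the dates from 2021-01-04 onward
enc_demo = pd.Series([21.0, 22.5], index=dates[2:], name='Adj Close')

# fillna with a Series aligns on the index: only NaNs whose date
# exists in enc_demo are filled, so 2020-12-31 stays NaN
df_demo['CAP.DE'] = df_demo['CAP.DE'].fillna(enc_demo)
print(df_demo)
```

Because the fill is index-aligned, the two objects do not need the same length, which matters here since the replacement data spans only a few days.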

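Or, if you want to replace the data from 2021-01-04 onward:

# Assign through df.loc to avoid chained assignment;
# the right-hand Series is aligned on the date index
df.loc['2021-01-04':, 'CAP.DE'] = df_enc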
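
Overwriting a date range in one column can also be tried on toy data first. The sketch below assumes a sorted DatetimeIndex and uses hypothetical names (df_demo, enc_demo) and invented values, where 99.0 and 98.0 play the role of the wrong prices.

```python
import pandas as pd

# Toy data (hypothetical values); 99.0 and 98.0 stand in for the wrong prices
dates = pd.to_datetime(['2020-12-31', '2021-01-04', '2021-01-05'])
df_demo = pd.DataFrame({'CAP.DE': [1.5, 99.0, 98.0]}, index=dates)
enc_demo = pd.Series([21.0, 22.5], index=dates[1:])

# Label-based slice on the DatetimeIndex selects 2021-01-04 and later;
# the assigned Series is matched to those rows by date, not by position
df_demo.loc['2021-01-04':, 'CAP.DE'] = enc_demo
print(df_demo)
```

Unlike the fillna approach, this replaces every value in the selected range, including values that were not NaN, and any selected date missing from the replacement series ends up as NaN.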