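I have a CSV file with the following columns:
Date | Mkt-RF | SMB | HML | RF | C | aig-RF | ford-RF | ibm-RF | xom-RF |
I am trying to run a multiple OLS regression in Python, for example regressing "aig-RF" on "Mkt-RF", "SMB" and "HML".
It seems I first need to build a DataFrame from the arrays, but I cannot figure out how: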
# Regression
x = df[['Mkt-RF','SMB','HML']]
y = df['aig-RF']
df = pd.DataFrame({'x':x, 'y':y})
df['constant'] = 1
df.head()
sm.OLS(y,df[['constant','x']]).fit().summary()
The full code is:
import numpy as np
import pandas as pd
from pandas import DataFrame
from sklearn import linear_model
import statsmodels.api as sm
def ReadFF(sIn):
    """
    Purpose:
        Read the FF data

    Inputs:
        sIn     string, name of input file

    Return value:
        df      dataframe, data
    """
    df= pd.read_csv(sIn, header=3, names=["Date","Mkt-RF","SMB","HML","RF"])
    df= df.dropna(how='any')

    # Reformat the dates, as date-time, and place them as index
    vDate= pd.to_datetime(df["Date"].values,format='%Y%m%d')
    df.index= vDate

    # Add in a constant
    iN= len(vDate)
    df["C"]= np.ones(iN)
    print(df)

    return df
def JoinStock(df,sStock,sPer):
    """
    Purpose:
        Join the stock into the dataframe, as excess returns

    Inputs:
        df      dataframe, data including RF
        sStock  string, name of stock to read
        sPer    string, extension indicating period

    Return value:
        df      dataframe, enlarged
    """
    df1= pd.read_csv(sStock+"_"+sPer+".csv", index_col="Date", usecols=["Date", "Adj Close"])
    df1.columns= [sStock]

    # Add prices to original dataframe, to get correct dates
    df= df.join(df1, how="left")

    # Extract returns
    vR= 100*np.diff(np.log(df[sStock].values))
    # Add a missing, as one observation was lost differencing
    vR= np.hstack([np.nan, vR])

    # Add excess return to dataframe
    df[sStock + "-RF"]= vR - df["RF"]
    print(df)

    return df
def SaveFF(df,asStock,sOut):
    """
    Purpose:
        Save data for the FF regressions

    Inputs:
        df      dataframe, all data
        asStock list of strings, stocks
        sOut    string, output file name

    Output:
        file written to disk
    """
    df= df.dropna(how='any')

    asOut= ['Mkt-RF', 'SMB', 'HML', 'RF', 'C']
    for sStock in asStock:
        asOut.append(sStock+"-RF")

    print ("Writing columns ", asOut, "to file ", sOut)
    df.to_csv(sOut, columns=asOut, index_label="Date", float_format="%.8g")
    print(df)

    return df
def main():
    sPer= "0018"
    sIn= "Research_Data_Factors_weekly.csv"
    sOut= "ffstocks"
    asStock= ["aig", "ford", "ibm", "xom"]

    # Initialisation
    df= ReadFF(sIn)
    for sStock in asStock:
        df= JoinStock(df, sStock, sPer)

    # Output
    SaveFF(df, asStock, sOut+"_"+sPer+".csv")
    print ("Done")

    # Regression
    x = df[['Mkt-RF','SMB','HML']]
    y = df['aig-RF']
    df = pd.DataFrame({'x':x, 'y':y})
    df['constant'] = 1
    df.head()
    sm.OLS(y,df[['constant','x']]).fit().summary()
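What exactly do I need to change in the pd.DataFrame call to get the multiple OLS regression table?
Answer 0 (score: 0)
I suggest changing the first part of your code to the following (mainly swapping the order):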
paths:
  /api/assignment:
    post:
      tags:
        - Assignment
      summary: "Endpoint to create Resources in system"
      description: "This endpoint will create blah blah"
      operationId: CreateResource
      requestBody:  # <-----------
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                Ganesh:
                  type: integer
                Test:
                  type: string
                RefClaim:
                  type: object      # <-----------
                  properties:       # <-----------
                    Data1:
                      type: object      # <-----------
                      properties:       # <-----------
                        FirstName:
                          type: string
                        LastName:
                          type: string
                    Data2:
                      type: object      # <-----------
                      properties:       # <-----------
                        FirstName2:
                          type: string
                        LastName2:
                          type: string
Hope this helps.