在Stata中复制Fama French Factor

时间:2018-03-20 11:50:42

标签: finance quantitative-finance

我正在尝试复制Fama-French营业利润因子(RMW)。我已经写了Stata代码,并在下面的图中得到了结果。作为参考,相关性仅为约0.909。我一直试图改善结果几周,但无法取得任何进展。我想知道我是否可以在这里得到一些建议。

非常感谢!

***Replicating RMW***
use "compustat.dta"

*Process preferred stock*
gen ps = pstkrv
replace ps = pstkl if ps == .
replace ps = pstk if ps == .
replace ps = 0 if ps == .
sum ps, d

*Process stockholder equity*
gen se = seq
replace se = ceq + pstk if se == .
replace se = at - lt if se == .
replace se = bkvlps * csho + pstkrv if se == .
replace se = bkvlps * csho + pstkl if se == .
replace se = bkvlps * csho + pstk if se == .
replace se = bkvlps * csho if se == .
sum se, d

*Process book equity***
replace itcb = 0 if itcb == .
replace txdb = 0 if txdb == .
gen be = se + txdb + itcb - ps
sum be, d

*Process operating profit*
gen n = 0
replace n = 1 if cogs != . | xsga != . | xint != .
replace n = 0 if n == 1 & revt == .
keep if n==1
drop n
keep if be !=.
replace cogs = 0 if cogs == .
replace xsga = 0 if xsga == .
replace xint = 0 if xint == .
drop if be < 0
gen op = 100*(revt - cogs - xsga - xint)/be
drop if op == .
gen n = 1
bys gvkey fyear: egen N = sum(n)
keep if N == 1
keep gvkey datadate fyear be op
append using "Multiple_Observations_Per_Year_processed.dta"
save "OP.dta",replace 

*CRSP Side Data Cleaning*
use "CRSP Monthly Return_2.dta", clear
gen year = year(date)
gen month = month(date)
gen modate = ym(year,month)
format modate %tm

keep if shrcd == 10 | shrcd == 11
keep if exchcd == 1 | exchcd == 2 | exchcd == 3
gen age = modate

*Adjust for Delisting*
replace ret = 0 if ret == . & dlret != .
replace dlret = 0 if dlret ==.
gen retadj = (1 + ret)*(1+dlret) - 1
replace ret = retadj
drop if ret ==.
gen price = abs(prc)
gen size = price*shrout

*Dealing with CRSP issue of multiple PERMNO for one PERMCO*
*Following conventional method, sum up market cap for permnos with the same permco, and assign to the permno with largest market cap. Keep that permno.*
gen n = 1
bys permco modate: egen N = sum(n)
sum N
preserve
keep if N > 1
bys permco modate: egen sum_size = sum(size)
bys permco modate: egen rank = rank(-size)
keep if rank == 1 | rank == 1.5
duplicates drop permno modate, force
drop rank
replace size = sum_size
drop sum_size
save "crsp_multiple_permno_2.dta",replace
restore
keep if N == 1
append using "crsp_multiple_permno_2.dta"
drop retadj n N
save "CRSP Monthly Return_2_processed.dta",replace

*Link Compustat accounting OP to CRSP data*
use "OP.dta", clear
merge m:m gvkey using"link_primary"
keep if _merge==3
drop _merge
gen link_year = year( linkdt)
gen end_year = year( linkenddt )
gen year = fyear + 1
keep if year >= link_year
keep if fyear <= end_year | end_year == .
rename N age_in_compustat
keep gvkey fyear op permno year age_in_compustat
duplicates drop permno year, force
merge 1:1 permno year using"June_Size_2.dta"
keep if _merge == 3
drop _merge
drop if size == .
tab year
tab fyear
drop if fyear < 1961

*Generate 2x3 portfolio using f-f breakpoints*
merge m:1 year using"OP_Breakpoints.dta"
keep if _merge==3
drop _merge
gen p_3 = .
replace p_3 = 1 if op <= thirty
replace p_3 = 2 if op > thirty & op < seventy
replace p_3 = 3 if op >= seventy
merge m:1 year month using"ME_Breakpoints.dta"
keep if _merge==3
drop _merge
replace size = size/1000
gen s_2 = .
replace s_2 = 1 if size <= fifty
replace s_2 = 2 if size > fifty
keep gvkey permno year p_3 s_2 age_in_compustat op
rename year myear
save "2x3 sorts.dta", replace

*Generate monthly portfolio returns*
use"CRSP Monthly Return_2_processed.dta", clear
drop shrcd exchcd prc shrout
gen myear = year
replace myear = year - 1 if month<=6
merge m:1 permno myear using"2x3 sorts.dta"
drop if _merge == 2
drop _merge
*drop if p_3 == 2
duplicates drop permno modate, force

*Weighting alternative #1*
tsset permno modate
gen l_size = l.size


*Weighting alternative #2*
tsset permno modate
gen l2_size = l2.size
gen l_retx = l.retx
gen l_size = l2_size*(1+l_retx)


*Generate returns*
preserve
collapse (mean) vwret = ret [w=l_size] if p_3 == 1, by(modate s_2)
reshape wide vwret, i(modate) j(s_2)
gen ret_weak = (vwret1 + vwret2) / 2
replace ret_weak = ret_weak * 100
keep modate ret_weak
save "ret_weak.dta",replace
restore

preserve
collapse (mean) vwret = ret [w=l_size] if p_3 == 3, by(modate s_2)
reshape wide vwret, i(modate) j(s_2)
gen ret_robust = (vwret1 + vwret2) / 2
replace ret_robust = ret_robust * 100
keep modate ret_robust
save "ret_robust.dta",replace

*Compare replicated RMW with actual rmw*
merge 1:1 modate using"ret_weak.dta"
drop _merge


gen RMW = ret_robust - ret_weak
sum RMW
merge 1:1 modate using"f-f rmw.dta"
keep if _merge == 3
drop _merge

gen diff = RMW-rmw
sum diff

*Plotting RMW against rmw*
gen age = modate

gen RMW_ret = (100+RMW)/100
gen rmw_ret = (100+rmw)/100

sort modate
gen cum_ret_RMW = .

replace cum_ret_RMW = RMW_ret if age == 42
replace cum_ret_RMW = RMW_ret * cum_ret_RMW[_n-1] if cum_ret_RMW == .

sort modate
gen cum_ret_rmw = .
replace cum_ret_rmw= rmw_ret if age == 42
replace cum_ret_rmw = rmw_ret * cum_ret_rmw[_n-1] if cum_ret_rmw == .

tsset modate
tsline cum_ret_RMW cum_ret_rmw

Cumulative Return (my replication is in blue, original factor in red)

1 个答案:

答案 0 :(得分:0)

我发现断点必须仅基于纽约证券交易所的股票,然后才能应用于其他证券交易所。经过这种改进,我的相关性从90%增加到96%。

希望有帮助。