Adding an R list to Stata as a macro?

Time: 2018-05-30 21:00:41

Tags: r stata stata-macros

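I want to run a Lasso model in R from within Stata and then bring the resulting character list (the names of the selected coefficients) back into Stata as a macro (for example, a global).

I currently know of two options:

  1. I save a dta file and run an R script from Stata using shell: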

    shell $Rloc --vanilla <"${LOC}/Lasso.R"
    

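    This works from the saved dta file and lets me run the Lasso model I want, but it is not interactive, so I cannot pass the relevant character list (the names of the selected variables) back to Stata.

  2. I run R interactively from Stata using rcall. However, rcall does not let me load a large enough matrix, even with Stata's memory settings at their maximum. My predictor matrix Z (which the Lasso is to subset) is 1,000 by 100, but when I run the command: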

    rcall: X <- st.matrix(Z) 
    

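    I receive an error stating:

    macro substitution results in line that is too long: The line resulting from substituting macros would be longer than allowed. The maximum allowed length is 645,216 characters, which is calculated on the basis of set maxvar.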

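Is there a way to run R interactively from Stata that allows large matrices, so that I can bring the character list from R back into Stata as a macro?

Thanks in advance.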

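1 Answer:

Answer 0: (score: 2)

Below I try to consolidate the comments into a hopefully useful answer.

Unfortunately, rcall does not seem to cope well with matrices as large as the one you need. I think it would be best to call R with the shell command to run your script, and have the script save the strings as variables in a dta file. This takes more work, but it is certainly programmable.

You can then read these variables into Stata and manipulate them easily with built-in functions. For example, you could save the strings in separate variables, or in a single variable and use levelsof, as @Dimitriy recommended.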
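As a rough sketch of that round trip: assume Lasso.R reads lasso_input.dta and writes the selected names to lasso_selected.dta as a string variable called varname. Both file names and the variable name are made up here, and the R side could write the file with something like haven::write_dta. Under those assumptions the Stata side might look like this:

```stata
* Sketch only: lasso_input.dta, lasso_selected.dta and varname are hypothetical names
save "${LOC}/lasso_input.dta", replace

* remove any stale result so a failed R run cannot go unnoticed
capture erase "${LOC}/lasso_selected.dta"

* run the R script non-interactively, as in the question
shell $Rloc --vanilla <"${LOC}/Lasso.R"

* stop here if R did not produce the expected file
confirm file "${LOC}/lasso_selected.dta"

* read the names back without losing the data currently in memory
preserve
use "${LOC}/lasso_selected.dta", clear
levelsof varname, local(selected) clean
restore

* the local survives restore; copy it into a global
global LASSO_VARS `selected'
display "$LASSO_VARS"
```

The clean option drops the compound quotes around each value, which is safe here only because variable names contain no spaces.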

Consider the following toy example:

    clear
    set obs 5

    * one string per observation
    input str50 string
    "this is a string"
    "A longer string is this"
    "A string that is even longer is this one"
    "How many strings do you have?"
    end

    * collect the distinct values, sorted, in the local macro newstr
    levelsof string, local(newstr)
    `"A longer string is this"' `"A string that is even longer is this one"' `"How many strings do you have?"' `"this is a string"'

    * split the macro into numbered positional macros
    tokenize `"`newstr'"'

    * display each string in turn
    forvalues i = 1 / `: word count `newstr'' {
        display "``i''"
    }

    A longer string is this
    A string that is even longer is this one
    How many strings do you have?
    this is a string
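The toy strings contain spaces, which is why the compound quotes and tokenize are needed. Variable names do not, so once they are in a global the macro can be used directly as a varlist. A small sketch, in which the auto dataset and the three names merely stand in for whatever R returns:

```stata
sysuse auto, clear

* Hypothetical selection, standing in for names read back from R
global LASSO_VARS mpg weight length

* check that every name exists in the data in memory (R may have altered names)
foreach v of global LASSO_VARS {
    confirm variable `v'
}

* use the macro wherever a varlist is expected
regress price $LASSO_VARS
```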

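In my experience, programs like rcall and rsource are very useful for simple tasks. However, they can become a real hassle for more complicated work, in which case I personally just resort to the real thing, that is, using the other software directly.

As @Dimitriy also pointed out, there are now some community-contributed commands available for lasso, which may suffice for your needs so that you do not have to fiddle with R: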

search lasso

5 packages found (Stata Journal and STB listed first)
-----------------------------------------------------

elasticregress from http://fmwww.bc.edu/RePEc/bocode/e
    'ELASTICREGRESS': module to perform elastic net regression, lasso
    regression, ridge regression / elasticregress calculates an elastic
    net-regularized / regression: an estimator of a linear model in which
    larger / parameters are discouraged.  This estimator nests the LASSO / and

lars from http://fmwww.bc.edu/RePEc/bocode/l
    'LARS': module to perform least angle regression / Least Angle Regression
    is a model-building algorithm that / considers parsimony as well as
    prediction accuracy.  This / method is covered in detail by the paper
    Efron, Hastie, Johnstone / and Tibshirani (2004), published in The Annals

lassopack from http://fmwww.bc.edu/RePEc/bocode/l
    'LASSOPACK': module for lasso, square-root lasso, elastic net, ridge,
    adaptive lasso estimation and cross-validation / lassopack is a suite of
    programs for penalized regression / methods suitable for the
    high-dimensional setting where the / number of predictors p may be large

pdslasso from http://fmwww.bc.edu/RePEc/bocode/p
    'PDSLASSO': module for post-selection and post-regularization OLS or IV
    estimation and inference / pdslasso and ivlasso are routines for
    estimating structural / parameters in linear models with many controls
    and/or / instruments. The routines use methods for estimating sparse /

sivreg from http://fmwww.bc.edu/RePEc/bocode/s
    'SIVREG': module to perform adaptive Lasso with some invalid instruments /
    sivreg estimates a linear instrumental variables regression / where some
    of the instruments fail the exclusion restriction / and are thus invalid.
    The LARS algorithm (Efron et al., 2004) is / applied as long as the Hansen
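
If one of these fits, the selected names may be available without leaving Stata. A minimal sketch, assuming lassopack is installed from SSC and that, as I recall from its help file, rlasso leaves the selected predictors in e(selected). The auto dataset and the variable list are only placeholders, so check the output of ereturn list on your own installation:

```stata
* Sketch only: assumes lassopack from SSC; e(selected) is from memory of its help file
ssc install lassopack

sysuse auto, clear
rlasso price mpg weight length turn displacement

* inspect what the command actually stored
ereturn list

* copy the selected predictor names into a global
global LASSO_VARS `e(selected)'
display "$LASSO_VARS"
```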