I am fairly new to C++ and Rcpp integration. I need to write a program in C++ with R integration that finds the MLE/root of a Cauchy distribution.
Below is my code so far.
#include <Rcpp.h>
#include <math.h>
#include <iostream>
#include <cstdlib>
using namespace std;
using namespace Rcpp;
// [[Rcpp::export]]
double Cauchy(double x, double y); //Declare Function
double Cauchy(double x,double y) //Define Function
{
return 1/(M_PI*(1+(pow(x-y,2)))); //write the equation whose roots are to be determined x=chosen y=theta
}
using namespace std;
// [[Rcpp::export]]
int Secant (NumericVector x){
NumericVector xvector(x) ; //Input of x vector
double eplison= 0.001; //Threshold
double a= xvector[3]; //Select starting point
double b= xvector[4];//Select end point
double c= 0.0; //initial value for c
double Theta= 10.6; //median value for theta estimate
int noofIter= 0; //Iterations
double error = 0.0;
if (std::abs(Cauchy(a, Theta)<(std::abs(Cauchy(a, Theta))))){
do{
a=b;
b=c;
error= (b-(Cauchy(b, Theta)))*((a-b)/(Cauchy(a, Theta)-Cauchy(b, Theta)));
error= Cauchy(c,Theta);
//return number of iterations
noofIter++;
for (int i = 0; i < noofIter; i += 1) {
cout << "The Value is " << c << endl;
cout << "The Value is " << a << endl;
cout << "The Value is " << b << endl;
cout << "The Value is " << Theta << endl;
}
}while (std::abs(error)>eplison);
}
cout<<"\nThe root of the equation is occurs at "<<c<<endl; //print the root
cout << "The number of iterations is " << noofIter;
return 0;
}
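Depending on the modifications I make, the program either goes into a never-ending loop or returns an infinitesimally small value.
My understanding of this math is limited, so any help or correction would be greatly appreciated.
The x vector I was given is: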
x <- c( 11.262307 , 10.281078 , 10.287090 , 12.734039 ,
11.731881 , 8.861998 , 12.246509 , 11.244818 ,
9.696278 , 11.557572 , 11.112531 , 10.550190 ,
9.018438 , 10.704774 , 9.515617 , 10.003247 ,
10.278352 , 9.709630 , 10.963905 , 17.314814)
From earlier R code we know that the MLE/root of this distribution is approximately 10.5935.
The code used to obtain this MLE is:
optimize(function(theta)-sum(dcauchy(x, location=theta,
log=TRUE)), c(-100,100))
Thanks!
Answer 0 (score: 2)
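The optimize() function searches directly for the extremum of the likelihood. An alternative is to apply a root-finding algorithm (e.g. the secant method) to the derivative of the (log-)likelihood. Wikipedia gives the equation that has to be solved. In R it might look like this: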
x <- c( 11.262307 , 10.281078 , 10.287090 , 12.734039 ,
11.731881 , 8.861998 , 12.246509 , 11.244818 ,
9.696278 , 11.557572 , 11.112531 , 10.550190 ,
9.018438 , 10.704774 , 9.515617 , 10.003247 ,
10.278352 , 9.709630 , 10.963905 , 17.314814)
ld <- function(sample, theta){
xp <- outer(sample, theta, FUN = "-")
colSums(xp/(1+xp^2))
}
uniroot(ld, sample = x, lower = 0, upper = 20)$root
#> [1] 10.59724
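For reference, the equation that ld encodes can be written out as follows. This is a sketch under the assumption of a Cauchy density with location theta and the scale fixed at 1, which is what ld implicitly uses. The function ld drops the constant factor 2, which does not move the root.

```latex
\ell(\theta) = -n \log \pi - \sum_{i=1}^{n} \log\!\left(1 + (x_i - \theta)^2\right)
\qquad\Longrightarrow\qquad
\frac{d\ell}{d\theta} = \sum_{i=1}^{n} \frac{2\,(x_i - \theta)}{1 + (x_i - \theta)^2} = 0
```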
Note that the derivative of the log-likelihood is vectorized in both arguments. This makes it easy to plot:
theta <- seq(0, 20, length=500)
plot(theta, ld(x, theta), type="l",
xlab=expression(theta), ylab=expression(ld(x, theta)))
The plot shows that it is difficult to find starting values from which the secant method works.
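One possible way to take some of the guesswork out of the starting values is to scan a coarse grid for a sign change of the derivative and use the end points of that interval. The sketch below does this in plain C++ without Rcpp. The names score and firstBracket are hypothetical helpers introduced only for this illustration, and the grid bounds and step count are assumptions the caller has to choose.

```cpp
#include <utility>
#include <vector>

// Derivative of the Cauchy log-likelihood at a single location theta
// (scale fixed at 1, constant factor dropped), as in the R function ld.
double score(const std::vector<double>& sample, double theta) {
    double total = 0.0;
    for (double x : sample) {
        const double d = x - theta;
        total += d / (1.0 + d * d);
    }
    return total;
}

// Hypothetical helper: walk a regular grid from lower to upper and return
// the first interval on which the score changes sign (or hits zero).
// Returns {upper, upper} when no sign change is found.
std::pair<double, double> firstBracket(const std::vector<double>& sample,
                                       double lower, double upper, int steps) {
    const double width = (upper - lower) / steps;
    double left = lower;
    double fLeft = score(sample, left);
    for (int i = 1; i <= steps; ++i) {
        const double right = lower + i * width;
        const double fRight = score(sample, right);
        if (fLeft * fRight <= 0.0) {
            return {left, right};
        }
        left = right;
        fLeft = fRight;
    }
    return {upper, upper};
}
```

For the sample from the question, a call such as firstBracket(x, 0.0, 20.0, 40) should return a short interval around the root found by uniroot, and its two end points can then serve as the starting values a and b.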
Let's move this to C++ (C++11, to be precise):
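#include <Rcpp.h>
#include <cmath>       // std::abs
#include <functional>  // std::function
// [[Rcpp::plugins(cpp11)]]

// Secant iteration for a root of f, started from the two estimates a and b.
Rcpp::List secant(const std::function<double(double)>& f,
                  double a, double b, int maxIterations, double epsilon) {
  double c(0.0);
  do {
    // Same update as c = b - f(b) * (b - a) / (f(b) - f(a))
    c = b * (1 - (1 - a/b) / (1 - f(a)/f(b)));
    a = b;
    b = c;
  } while (maxIterations-- > 0 && std::abs(a - b) > epsilon);
  // Judge convergence by the size of the last step, so that a run that
  // converges on its final permitted iteration is not reported as a failure.
  return Rcpp::List::create(Rcpp::Named("root") = c,
                            Rcpp::Named("f.root") = f(c),
                            Rcpp::Named("converged") = (std::abs(a - b) <= epsilon));
}

// [[Rcpp::export]]
Rcpp::List mleCauchy(const Rcpp::NumericVector& sample, double a, double b,
                     int maxIterations = 100, double epsilon = 0.0001) {
  // Derivative of the log-likelihood (up to a constant factor) at theta;
  // the sample is captured by reference.
  auto f = [&sample](double theta) {
    Rcpp::NumericVector xp = sample - theta;
    xp = xp / (1 + xp * xp);
    return Rcpp::sum(xp);
  };
  return secant(f, a, b, maxIterations, epsilon);
}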
/*** R
x <- c( 11.262307 , 10.281078 , 10.287090 , 12.734039 ,
11.731881 , 8.861998 , 12.246509 , 11.244818 ,
9.696278 , 11.557572 , 11.112531 , 10.550190 ,
9.018438 , 10.704774 , 9.515617 , 10.003247 ,
10.278352 , 9.709630 , 10.963905 , 17.314814)
mleCauchy(x, 11, 15)
#-> does not converge
mleCauchy(x, 11, 14)
#-> 10.59721
mleCauchy(x, mean(x), median(x))
#-> 10.59721
*/
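The secant() function works with any std::function that takes a double as its argument and returns a double. Such a function is then defined as a lambda function that depends on the supplied sample values. As expected, the correct root is only obtained when starting from values close to the correct one.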
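To illustrate that the same routine accepts any callable of this shape, here is a minimal sketch in plain C++ without Rcpp. It uses the textbook form of the secant update, which is algebraically the same as the formula above, and adds a guard against a flat secant line. The names secantRoot, rootOfTwo and cauchyLocation are hypothetical and exist only for this example.

```cpp
#include <cmath>
#include <functional>
#include <vector>

// Secant iteration in its textbook form; stops when the last step is at
// most epsilon, when maxIterations is reached, or when f(a) == f(b).
double secantRoot(const std::function<double(double)>& f, double a, double b,
                  int maxIterations = 100, double epsilon = 1e-8) {
    for (int i = 0; i < maxIterations && std::abs(a - b) > epsilon; ++i) {
        const double fa = f(a);
        const double fb = f(b);
        if (fa == fb) break;  // flat secant line: no further update possible
        const double c = b - fb * (b - a) / (fb - fa);
        a = b;
        b = c;
    }
    return b;
}

// A lambda without captures: root of x^2 - 2, started from 1 and 2.
double rootOfTwo() {
    return secantRoot([](double x) { return x * x - 2.0; }, 1.0, 2.0);
}

// A lambda that captures the sample by reference: root of the derivative
// of the Cauchy log-likelihood (scale fixed at 1).
double cauchyLocation(const std::vector<double>& sample, double a, double b) {
    auto f = [&sample](double theta) {
        double total = 0.0;
        for (double x : sample) {
            const double d = x - theta;
            total += d / (1.0 + d * d);
        }
        return total;
    };
    return secantRoot(f, a, b);
}
```

Both lambdas convert to std::function<double(double)>, so secantRoot never needs to know where the function comes from. With the sample from the question and starting values close to the root, cauchyLocation should land near the value reported above.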
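Lambda functions may look a little confusing at first sight, but they are very close to what we are used to in R. Here is the same algorithm written in R: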
secant <- function(f, a, b, maxIterations, epsilon) {
for (i in seq.int(maxIterations)) {
c <- b * (1 - (1 - a/b) / (1 - f(a)/f(b)))
a <- b
b <- c
if (abs(a - b) <= epsilon)
break
}
list(root = c, f.root = f(c), converged = isTRUE(abs(a - b) <= epsilon))
}
mleCauchy <- function(sample, a, b, maxIterations = 100L, epsilon = 0.001) {
f <- function(theta) {
xp <- sample - theta
sum(xp/(1 + xp^2))
}
secant(f, a, b, maxIterations, epsilon)
}
x <- c( 11.262307 , 10.281078 , 10.287090 , 12.734039 ,
11.731881 , 8.861998 , 12.246509 , 11.244818 ,
9.696278 , 11.557572 , 11.112531 , 10.550190 ,
9.018438 , 10.704774 , 9.515617 , 10.003247 ,
10.278352 , 9.709630 , 10.963905 , 17.314814)
mleCauchy(x, 11, 12)
#-> 10.59721
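Both the R function f and the lambda function f take the vector sample from the environment in which they are defined. In R this happens implicitly, whereas in C++ we have to state explicitly that this value should be captured. The number theta is the argument supplied when the function is called, i.e. the successive estimates of the root, starting with a and b.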
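The difference between what is captured and what is passed as an argument can be seen in a small plain C++ sketch. The function names sumByReference and sumByValue are hypothetical. The only point is how the vector values reaches the lambda, while shift plays the role of theta and is supplied at call time.

```cpp
#include <vector>

// Capture by reference ([&values]): the lambda sees the caller's vector,
// including changes made after the lambda was created.
double sumByReference(std::vector<double>& values, double offset) {
    auto f = [&values](double shift) {
        double total = 0.0;
        for (double v : values) total += v + shift;
        return total;
    };
    values.push_back(10.0);  // visible inside f
    return f(offset);        // offset is passed as the argument shift
}

// Capture by value ([values]): the lambda keeps its own copy, taken at the
// moment the lambda is created.
double sumByValue(std::vector<double>& values, double offset) {
    auto f = [values](double shift) {
        double total = 0.0;
        for (double v : values) total += v + shift;
        return total;
    };
    values.push_back(10.0);  // not visible inside f
    return f(offset);
}
```

In mleCauchy the capture is by reference, which avoids copying the sample. That is safe there because the lambda is only used while sample is still alive.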