R loop using the bp function

Time: 2013-07-22 08:27:21

Tags: r

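I have a dataset with 3 columns: country, year and tdvalue. I want to loop over countries to create a dummy variable (sd) that is 1 if the year is a breakpoint according to R's breakpoints function, and 0 otherwise. My code runs, but my sd variable is always equal to 0, even though I know that some years are breakpoints.

Thank you very much for your help!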

library(zoo)
library(sandwich)
library(strucchange)
library(segmented)
library(tree)

tabo<-read.table("boucle.txt", header=T, sep="\t")

Fonction.bp<-function(b)
  bp.inf <- breakpoints(tabo$year ~ tabo$tradevaluein1000usd , tabo = tabo[b,], h = 8)
  t<-breakdates(confint(bp.inf))
  for (i in 1:nrow(t)) {
    res <- ifelse(tabo$year[b] == t[i,1] , 1, 0)
    return(res)
  }
}

numero<-1:nrow(tabo)
tabo$sd<-lapply(tabo$code_o,Fonction.bp)

Data sample:

code_o -origin  -year   -tradevaluein1000usd

ABW Aruba   1988    375.059
ABW Aruba   1989    3458.656
ABW Aruba   1990    2924.484
ABW Aruba   1991    140509.4
(for several countries)

  

dput(TABO):

structure(list(code_o = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L), .Label = c("ABW", "AFG", "AGO"), class = "factor"), 
    origin = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Afghanistan", "Angola", 
    "Aruba"), class = "factor"), year = c(1988L, 1989L, 1990L, 
    1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 
    2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 
    2009L, 2010L, 2011L, 2012L, 1988L, 1989L, 1990L, 1991L, 1992L, 
    1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 
    2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 
    2011L, 2012L, 1988L, 1989L, 1990L, 1991L, 1992L, 1993L, 1994L, 
    1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 
    2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L
    ), tradevaluein1000usd = c(375.059, 3458.656, 2924.484, 140509.4, 
    326377, 548739.3, 570287.9, 673563.2, 809647.7, 1021996, 
    680243.7, 944974.8, 1950097, 1416807, 1055372, 1276015, 2503752, 
    3908081, 4294362, 4654180, 5523432, 2203173, 272596.5, 4450387, 
    127760.6, 121861.2, 125059.8, 134163.4, 115283.5, 82499.51, 
    68673.89, 97143.18, 104883.2, 124654.5, 155892.9, 167802.9, 
    137721, 153405.3, 99146.39, 103894.9, 190640.9, 209073.9, 
    264083.6, 254765.3, 408123.6, 507407, 1283451, 609946.1, 
    486418.4, 67638.02, 1112926, 3120863, 4082248, 3290223, 3796494, 
    3283747, 3175830, 3614761, 4669298, 4618304, 3501481, 4478671, 
    7878114, 6290144, 7344164, 8563406, 11900000, 20700000, 30200000, 
    39500000, 65700000, 38900000, 50600000, 59400000, 8839811
    ), sd = list(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0)), .Names = c("code_o", "origin", "year", 
"tradevaluein1000usd", "sd"), row.names = c(NA, -75L), class = "data.frame")

1 answer:

Answer 0 (score: 0)

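You have bad code.

If you expect other people to give you their time, you have to put in more effort yourself.

Your function does not work: tabo['AGO',], tabo['AFG',] and tabo['ABW',] are all empty, because there are no rows with those names. I think you probably want to subset the data with something like: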

tabo[tabo$code_o == 'AGO',]
tabo[tabo$code_o == 'AFG',]
tabo[tabo$code_o == 'ABW',]
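A minimal sketch of the difference, using a made-up three-row data frame `df` (not the asker's data): indexing by a character string looks for a row with that *name*, while a logical comparison selects rows by the *value* of a column.

```r
# Toy data frame (an assumption for illustration; row names default to 1, 2, 3)
df <- data.frame(
  code_o = c("ABW", "ABW", "AFG"),
  year   = c(1988L, 1989L, 1988L)
)

# Looks up a row *named* "ABW"; no such row name exists, so only NAs come back
by_name <- df["ABW", ]

# Keeps every row whose code_o column equals "ABW"
by_value <- df[df$code_o == "ABW", ]

print(by_name)
print(by_value)
```

The logical index `df$code_o == "ABW"` has one TRUE/FALSE entry per row, which is why it picks out the country's rows regardless of how the rows are named.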
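bp.inf comes out the same whether or not we include tabo = tabo[b,], because the formula refers to tabo$year and tabo$tradevaluein1000usd, which are taken from the global environment, instead of a data frame being passed in (you wrote tabo = rather than data =). If this is confusing, forget about it...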
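The same effect can be seen with base R's `lm`, used here only as a stand-in for `breakpoints` because both take a formula and a `data` argument. This is a sketch on a made-up data frame `df`: when the formula spells out `df$y ~ df$x`, the columns are found in the full `df` and the `data` argument has no influence.

```r
# Toy data (an assumption for illustration): two groups of five rows
df <- data.frame(
  g = rep(c("a", "b"), each = 5),
  x = 1:10,
  y = c(1, 2, 3, 4, 5, 10, 8, 6, 4, 2)
)
sub <- df[df$g == "a", ]

fit_full   <- lm(df$y ~ df$x)              # all 10 rows
fit_dollar <- lm(df$y ~ df$x, data = sub)  # still all 10 rows: `data` is bypassed
fit_names  <- lm(y ~ x, data = sub)        # only the 5 rows of the subset

print(c(full = nobs(fit_full), dollar = nobs(fit_dollar), names = nobs(fit_names)))
```

Writing the formula with bare column names (`y ~ x`) is what lets `data =` decide which rows are used.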

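Most importantly, there are some errors in the line that finds the breakpoints. You need to change it to something like bp.inf <- breakpoints(year ~ tradevaluein1000usd, data = tabo[tabo$code_o == 'AGO',], h = 8).

Also note that your function is never opened with {, so it cannot work at all.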
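Putting these points together, one possible shape for the per-country dummy is sketched below. It is not the asker's final code: `flag_breaks` and the `toy` data frame are hypothetical names invented for the example, the data are synthetic, and the formula is copied unchanged from the suggestion above (whether `year ~ tradevaluein1000usd` is the right model is a separate question). The sketch assumes rows are already sorted by year within each country, and it avoids calling `return()` inside a `for` loop, which would exit on the first iteration.

```r
library(strucchange)

# Synthetic data (assumption): two made-up countries, 25 years each,
# with a level shift plus a small deterministic wiggle
years <- 1988:2012
toy <- data.frame(
  code_o = rep(c("AAA", "BBB"), each = length(years)),
  year   = rep(years, times = 2),
  tradevaluein1000usd = c(
    ifelse(years < 2000, 100, 1000) + 10 * sin(years),
    ifelse(years < 1996, 500, 50) + 5 * cos(years)
  )
)

# Hypothetical helper: add a 0/1 column `sd` to one country's rows
flag_breaks <- function(d, h = 8) {
  bp  <- breakpoints(year ~ tradevaluein1000usd, data = d, h = h)
  idx <- bp$breakpoints          # row positions of the selected breaks, NA if none
  d$sd <- 0L
  if (!anyNA(idx)) d$sd[idx] <- 1L
  d
}

# Apply the helper country by country and stack the pieces back together
toy <- do.call(rbind, lapply(split(toy, toy$code_o), flag_breaks))
rownames(toy) <- NULL

print(toy[toy$sd == 1L, ])
```

Because the helper returns the whole subset with `sd` as an integer column, the result is an ordinary data frame column rather than the list column that `lapply` produced in the question.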