R - Creating multi-column boxplots with labeled outliers

Time: 2016-08-18 20:09:50

Tags: r

I am trying to create boxplots with labeled outliers. My data is in long format:

head(data.basic)
       STATE variable       value
1    ALABAMA  FY_1998 0.363746457
2     ALASKA  FY_1998 0.632334359
3    ARIZONA  FY_1998 0.512511586
4   ARKANSAS  FY_1998 0.485002318
5 CALIFORNIA  FY_1998 0.696569322
6   COLORADO  FY_1998 0.351297291

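The goal is to create one boxplot per variable (i.e. "FY_1998" through "FY_2013"). This is simple with the default boxplot function, but the result does not include labeled outliers. The Boxplot function from the car package is more challenging. I was able to create a boxplot for a single year from the wide-format data using the following code: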

Boxplot(Basic.Assistance[["FY_1998"]], labels=rownames(Basic.Assistance))

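However, I could not extend this approach to the other variables, so I tried using Boxplot with the long-format data. Any help with extending the approach above to cover FY_1998 through FY_2013, or with code that produces the same result from the long format, would be greatly appreciated.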

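I am new to R and to this forum, and I apologize if I have omitted any necessary material. Please let me know if more code or information is needed. Thanks in advance.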

summary(data.basic)
STATE             variable            value          
 Length:832         Length:832         Length:832        
 Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character  
> tail(data.basic)
        STATE variable       value
827      VIRGINIA  FY_2013 0.346652203
828    WASHINGTON  FY_2013 0.215769738
829 WEST_VIRGINIA  FY_2013 0.219831256
830     WISCONSIN  FY_2013 0.226368331
831       WYOMING  FY_2013 0.153766717
832 AVERAGE_STATE  FY_2013 0.235787342

1 answer:

Answer 0 (score: 0)

%%%%% Shot 1: failed miserably. How about:

library(car)  # provides Boxplot()
data.basic <- read.table(header=TRUE, text="
        STATE variable       value
      ALABAMA  FY_1998 0.363746457
       ALASKA  FY_1998 0.632334359
      ARIZONA  FY_1998 0.512511586
     ARKANSAS  FY_1998 0.485002318
   CALIFORNIA  FY_1998 0.696569322
     COLORADO  FY_1998 0.351297291
     VIRGINIA  FY_2013 0.346652203
   WASHINGTON  FY_2013 0.215769738
WEST_VIRGINIA  FY_2013 0.219831256
    WISCONSIN  FY_2013 0.226368331
      WYOMING  FY_2013 0.153766717
AVERAGE_STATE  FY_2013 0.235787342")

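## The formula method of Boxplot() labels outliers with the row names.
## Row names must be unique, so this only works while each state appears once;
## in the full data every state repeats across years.
rownames(data.basic) <- data.basic$STATE
Boxplot(value~variable, data=data.basic)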

%%%% Shot 2: maybe more luck this time?

library(dplyr)
library(car)
set.seed(4)
states <- c("ALABAMA","ALASKA","ARIZONA","ARKANSAS","CALIFORNIA","COLORADO","VIRGINIA","WASHINGTON")
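data.basic <- data.frame(STATE=rep(states,2),
                         ## factor() is needed in R >= 4.0, where strings are no longer
                         ## converted to factors automatically and levels() would return NULL
                         variable=factor(rep(c("FY_1998","FY_2013"),each=length(states))),
                         value=rexp(2*length(states)))
## This does not work:
##Boxplot(value~variable, labels=data.basic$STATE, data=data.basic)
yl <- range(data.basic$value)        # common y-axis range for all boxes
vars <- levels(data.basic$variable)
## Draw one variable at a time, adding each box to the first plot, so that
## the row names (used as outlier labels) are unique within each subset
lapply(seq_along(vars), function(ii) {
    dat <- data.basic %>% filter(variable==vars[ii])
    rownames(dat) <- dat$STATE
    Boxplot(value~variable, data=dat, ylim=yl, add=(ii!=1))
})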
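
A possible alternative, sketched with base graphics only, that avoids the unique row name restriction: flag the outliers per variable with boxplot.stats() (the same 1.5 * IQR rule that boxplot() applies by default) and label them with text(). The data values below are hypothetical and chosen so that each year has one obvious outlier. The names is_out and out are illustrative and do not come from the question.

```r
# Hypothetical long-format data: six states, two fiscal years,
# with one deliberately extreme value per year.
states <- c("ALABAMA", "ALASKA", "ARIZONA", "ARKANSAS", "CALIFORNIA", "COLORADO")
data.basic <- data.frame(
  STATE    = rep(states, 2),
  variable = factor(rep(c("FY_1998", "FY_2013"), each = length(states))),
  value    = c(0.30, 0.32, 0.34, 0.36, 0.38, 0.95,   # FY_1998
               0.02, 0.20, 0.22, 0.24, 0.26, 0.28)   # FY_2013
)

# Flag outliers within each variable. ave() applies the function group-wise
# and returns 0/1 in the original row order, hence the comparison with 1.
is_out <- ave(data.basic$value, data.basic$variable,
              FUN = function(v) v %in% boxplot.stats(v)$out) == 1
out <- data.basic[is_out, ]

# One box per variable; boxes are drawn at x = 1, 2, ... in factor-level order,
# so the integer code of the factor gives the x position of each label.
boxplot(value ~ variable, data = data.basic)
if (nrow(out) > 0) {
  text(as.integer(out$variable), out$value,
       labels = out$STATE, pos = 4, cex = 0.7)
}
```

Two caveats if this is tried on the data in the question: summary(data.basic) reports value as character, so it would first need converting, for example with data.basic$value <- as.numeric(data.basic$value), and variable would need to be a factor (or be wrapped in factor()) for the x positions to line up.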