R data frame issue prevents normality testing

Time: 2018-04-27 22:27:27

Tags: r dataframe

I have read in my .csv file and then converted it to a data frame using several approaches, including:

df<-read.csv('cdSH2015Fall.csv', dec = ".", na.strings = c("na"), header=TRUE, 
row.names=NULL, stringsAsFactors=F)


df<-as.data.frame(lapply(df, unlist)) # converted .csv to a data.frame

str(df) # provides the structure of df. 
'data.frame':   72 obs. of  16 variables:
 $ trtGroup            : Factor w/ 68 levels "AANN","AAPN",..: 5 7 14 18 20 23 27 33 37 48 ...
 $ cd                  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
 $ PreviousExp         : Factor w/ 2 levels "Empty","Enriched": 2 1 2 2 2 2 1 1 1 1 ...
 $ treatment           : Factor w/ 2 levels "NN","PN": 1 1 1 1 1 1 1 1 1 1 ...
 $ total.Area.DarkBlue.: num  827 1037 663 389 983 ...
 $ numberOfGroups      : int  1 1 1 1 1 1 1 1 1 1 ...
 $ totalGroupArea      : num  15.72 2.26 9.45 11.57 9.73 ...
 $ averageGrpArea      : num  15.72 2.26 9.45 11.57 9.73 ...
 $ proximityToPlants   : num  5.65 16.05 2.58 9.65 4.74 ...
 $ latFeed             : num  2 0.5 0 1 0 0 1 0.5 2 1 ...
 $ latBalloon          : num  6 2 2 NA 0 0.1 3 0.5 1 0.7 ...
 $ countChases         : int  5 8 16 4 16 21 18 11 14 28 ...
 $ chases              : int  95 87 67 923 636 96 1210 571 775 816 ...
 $ grpDiameter         : num  16.8 23.3 19.5 11.2 29.9 ...
 $ grpActiv            : num  4908 5164 4197 5263 5377 ...
 $ NND                 : num  0 11.88 8.98 3.6 9.8 ...

I then ran my model in two ways:

Option 1.

 fit = t.test(df$proximityToPlants[which(df$cd == 1 & df$treatment == 'PN')],
              df$proximityToPlants[which(df$cd == 0 & df$treatment == 'PN')])

Option 2 tries to make sure I have a proper data frame.

  • Subset the data, then create a matrix.

     cdProximityToPlantsPN<-cdSH2015Fall$proximityToPlants[which (cdSH2015Fall$cd==1 & cdSH2015Fall$treatment == 'PN')]
    H2ProximityToPlantsPN<-cdSH2015Fall$proximityToPlants[which (cdSH2015Fall$cd==0 & cdSH2015Fall$treatment == 'PN')]
    
    cdProximityToPlantsNN<-cdSH2015Fall$proximityToPlants[which (cdSH2015Fall$cd==1 & cdSH2015Fall$treatment == 'NN')]
    H2ProximityToPlantsNN<-cdSH2015Fall$proximityToPlants[which (cdSH2015Fall$cd==0 & cdSH2015Fall$treatment == 'NN')]
    
  • Create the matrix

    df<- 
      cbind(cdProximityToPlantsPN,H2ProximityToPlantsPN,cdProximityToPlantsNN, 
    H2ProximityToPlantsNN)
    mat <- sapply(df,unlist)
    fit=t.test(mat[,1],mat[,2], paired = F, var.equal = T)
    

However, I still get errors when I assess outliers with the following:

outlierTest(fit) # Bonferroni p-value for most extreme obs
Error in UseMethod("outlierTest") : 
  no applicable method for 'outlierTest' applied to an object of class 
 "htest"
qqPlot(fit, main="QQ Plot") #qq plot for studentized resid 
Error in order(x[good]) : unimplemented type 'list' in 'orderVector1'
leveragePlots(fit) # leverage plots
Error in formula.default(model) : invalid formula

I know the problem must be related to the structure of my data. Any ideas on how to fix it?
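
One possible reading of the errors above, offered as a minimal sketch rather than a confirmed fix: `t.test()` returns an object of class "htest", while the diagnostics being called look for a fitted model such as an `lm` object, which is what the "no applicable method ... of class "htest"" message points to. An equal-variance two-sample t-test can be written as `lm(y ~ group)`, so the same comparison can be refit that way. The data below are simulated, and the names `toy`, `tt`, `fit_lm` and `res` are hypothetical stand-ins, not objects from the original script.

```r
# Hypothetical data standing in for the PN rows of the CSV (values are simulated).
set.seed(1)
toy <- data.frame(
  cd = factor(rep(c(0, 1), each = 10)),
  proximityToPlants = c(rnorm(10, mean = 8, sd = 3),
                        rnorm(10, mean = 10, sd = 3))
)

# t.test() returns an "htest" object, which carries no residuals or model frame.
tt <- t.test(proximityToPlants ~ cd, data = toy, var.equal = TRUE)
class(tt)

# The equal-variance two-sample t-test is the same model as lm(y ~ group).
fit_lm <- lm(proximityToPlants ~ cd, data = toy)
class(fit_lm)

# Studentized residuals are available from base R for an lm object.
res <- rstudent(fit_lm)

# car's diagnostics have methods for lm objects; run them only if car is installed.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::outlierTest(fit_lm))        # Bonferroni p-value for most extreme obs
  car::qqPlot(fit_lm, main = "QQ Plot")  # QQ plot of studentized residuals
}
```

The t statistic for the `cd1` coefficient in `summary(fit_lm)` matches the one from `t.test(..., var.equal = TRUE)` up to sign, so the hypothesis test itself is unchanged; only the object class differs.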

0 answers:

No answers