How do I run shapiro.test on multiple columns of a data.frame and avoid two errors: "all 'x' values are identical" and "missing value where TRUE/FALSE needed"?

Asked: 2019-01-22 15:47:23

Tags: r na normal-distribution

I have a data frame like this:

head(Betula, 10)

  year start Start_DayOfYear  end End_DayOfYear duration DateMax Max_DayOfYear BetulaPollenMax SPI Jan.NAO Jan.AO
1 1997  <NA>              NA <NA>            NA       NA    <NA>            NA              NA  NA   -0.49  -0.46
2 1998  <NA>             143 <NA>           184       41    <NA>           146              42 361    0.39  -2.08
3 1999  <NA>             148 <NA>           188       40    <NA>           158              32 149    0.77   0.11
4 2000  <NA>             135 <NA>           197       62    <NA>           156             173 917    0.60   1.27
5 2001  <NA>             143 <NA>           175       32    <NA>           154             113 457    0.25  -0.96
  Jan.SO Feb.NAO Feb.AO Feb.SO Mar.NAO Mar.AO Mar.SO Apr.NAO Apr.AO Apr.SO DecJanFebMarApr.NAO DecJanFebMar.NAO
1    0.5    1.70   1.89    1.7    1.46   1.09   -0.4   -1.02   0.32   -0.6                0.14             0.43
2   -2.7   -0.11  -0.18   -2.0    0.87  -0.25   -2.4   -0.68  -0.04   -1.4                0.27             0.51
3    1.8    0.29   0.48    1.0    0.23  -1.49    1.3   -0.95   0.28    1.4                0.39             0.73
4    0.7    1.70   1.08    1.7    0.77  -0.45    1.3   -0.03  -0.28    1.2                0.49             0.62
5    1.0    0.45  -0.62    1.7   -1.26  -1.69    0.9    0.00   0.91    0.2               -0.28            -0.35
  DecJanFeb.NAO DecJan.NAO JanFebMarApr.NAO JanFebMar.NAO JanFeb.NAO FebMarApr.NAO FebMar.NAO MarApr.NAO
1          0.08      -0.73             0.41          0.89       0.61          0.71       1.58       0.22
2          0.38       0.63             0.12          0.38       0.14          0.03       0.38       0.10
3          0.89       1.19             0.09          0.43       0.53         -0.14       0.26      -0.36
4          0.57       0.01             0.76          1.02       1.15          0.81       1.24       0.37
5         -0.04      -0.29            -0.14         -0.19       0.35         -0.27      -0.41      -0.63
  DecJanFebMarApr.AO DecJanFebMar.AO DecJanFeb.AO DecJan.AO JanFebMarApr.AO JanFebMar.AO JanFeb.AO FebMarApr.AO
1               0.55            0.61         0.45     -0.27            0.71         0.84      0.72         1.10
2              -0.24           -0.29        -0.30     -0.37           -0.64        -0.84     -1.13        -0.16
3               0.08            0.04         0.54      0.58           -0.16        -0.30      0.30        -0.24
4              -0.15           -0.11         0.00     -0.54            0.41         0.63      1.18         0.12
5              -0.74           -1.15        -0.97     -1.14           -0.59        -1.09     -0.79        -0.47
  FebMar.AO MarApr.AO DecJanFebMarApr.SO DecJanFebMar.SO DecJanFeb.SO DecJan.SO JanFebMarApr.SO JanFebMar.SO
1      1.49      0.71               0.04            0.20         0.40     -0.25            0.30         0.60
2     -0.22     -0.15              -1.42           -1.43        -1.10     -0.65           -2.13        -2.37
3     -0.51     -0.61               1.38            1.38         1.40      1.60            1.38         1.37
4      0.32     -0.37               1.14            1.13         1.07      0.75            1.23         1.23
5     -1.16     -0.39               0.60            0.70         0.63      0.10            0.95         1.20
  JanFeb.SO FebMarApr.SO FebMar.SO MarApr.SO TmaxAprI TminAprI TmeanAprI RainfallAprI HumidityAprI SunshineAprI
1      1.10         0.23      0.65     -0.50     3.27    -3.86     -0.44         0.82         76.3         3.45
2     -2.35        -1.93     -2.20     -1.90     4.52    -3.28     -0.15         0.12         73.5         7.12
3      1.40         1.23      1.15      1.35     4.11    -3.86     -0.34         1.32         78.4         4.85
4      1.20         1.40      1.50      1.25     6.11    -1.31      1.93         0.80         71.9         4.20
5      1.35         0.93      1.30      0.55     1.46    -2.37     -1.04         2.83         84.4         1.21
  CloudAprI WindAprI SeeLevelPressureAprI TmaxAprII TminAprII TmeanAprII RainfallAprII HumidityAprII
1      6.30     5.26              1008.63     12.12      2.11       6.17          0.23          76.5
2      3.93     3.86              1022.39      5.57     -0.44       1.82          0.83          77.9
3      5.02     3.23              1007.09      0.20     -6.36      -3.23          2.63          82.5
4      6.15     5.13              1012.21      2.74     -4.88      -2.35          0.34          76.0
5      7.50     3.90              1009.50      6.75     -3.22       1.16          0.32          71.5
  SunshineAprII CloudAprII WindAprII SeeLevelPressureAprII TmaxAprIII TminAprIII TmeanAprIII RainfallAprIII
1          3.12       6.53      5.19               1024.31       7.35       0.33        3.37           0.33
2          2.41       6.85      3.70               1012.01       6.34       0.76        2.69           2.01
3          4.99       5.87      6.23               1019.66       8.65       0.73        4.23           0.70
4          6.63       5.17      5.84               1022.62       5.84      -1.81        2.02           0.00
5          6.11       4.82      3.92               1018.81       8.47       1.02        4.17           1.09
  HumidityAprIII SunshineAprIII CloudAprIII WindAprIII SeeLevelPressureAprIII TmaxDecI TminDecI TmeanDecI
1           75.0           3.73        6.40       4.08                1009.91    -0.90    -5.88     -3.67
2           83.5           1.52        7.31       4.66                1008.33     5.33     0.01      2.46
3           73.4           6.62        5.12       3.16                1017.01    -0.24    -6.93     -3.64
4           69.0           8.80        4.80       4.99                1021.18     4.67     1.86      2.79
5           72.7           5.33        5.41       4.27                1005.48     3.69    -1.43      1.65
  RainfallDecI HumidityDecI SunshineDecI CloudDecI WindDecI SeeLevelPressureDecI TmaxDecII TminDecII TmeanDecII
1         0.12         77.3         0.22      5.08     3.49              1003.15      7.99      0.77       4.10
2         1.10         73.5         0.04      6.29     5.21               999.94      0.24     -4.74      -2.67
3         2.41         82.3         0.00      6.70     4.92               998.64      1.22     -5.90      -2.05
4         3.13         88.1         0.00      7.97     4.00               997.82      2.76     -3.89      -0.54
5         1.60         79.1         0.07      5.44     5.76               996.35     10.82      4.36       6.90
  RainfallDecII HumidityDecII SunshineDecII CloudDecII WindDecII SeeLevelPressureDecII TmaxDecIII TminDecIII
1          1.90          71.3             0       4.96      5.55               1007.16       4.78      -2.12
2          4.34          82.2             0       7.03      6.06                998.02       2.07      -4.60
3          1.94          78.6             0       6.53      5.82               1008.33       2.09      -2.48
4          1.45          77.2             0       6.57      5.26               1005.11      -1.49      -8.37
5          1.15          66.6             0       5.74      5.47               1030.02       1.40      -7.34
  TmeanDecIII RainfallDecIII HumidityDecIII SunshineDecIII CloudDecIII WindDecIII SeeLevelPressureDecIII TmaxFebI
1        1.15           3.96          82.36              0        6.01       4.02                 991.60    -0.23
2       -0.51           4.10          81.18              0        6.67       3.91                 986.52     0.79
3       -0.61           1.97          81.27              0        6.21       5.53                 982.13     2.19
4       -5.28           1.26          79.64              0        6.11       4.22                1019.63     3.27
5       -3.45           1.19          82.18              0        6.20       4.77                1015.53     2.42
  TminFebI TmeanFebI RainfallFebI HumidityFebI SunshineFebI CloudFebI WindFebI SeeLevelPressureFebI TmaxFebII
1    -6.67     -3.57         0.84         84.3         1.11      6.81     5.35               990.51      2.97
2    -7.79     -4.49         2.31         72.2         1.88      4.73     4.53               990.39      3.31
3    -4.14     -1.77         0.42         73.3         1.29      6.02     5.57              1007.67      1.55
4    -2.48      0.04         2.28         77.0         0.46      6.84     4.29               982.97     -1.24
5    -3.52     -0.74         1.98         81.5         0.76      5.78     4.93              1008.29      6.71
  TminFebII TmeanFebII RainfallFebII HumidityFebII SunshineFebII CloudFebII WindFebII SeeLevelPressureFebII
1     -2.31      -0.10          1.44          82.2          1.07       6.45      4.42                980.59
2     -4.85      -0.99          3.84          75.0          2.54       5.91      5.05                999.98
3     -5.76      -2.44          2.89          75.3          0.40       6.95      5.82                990.44
4     -8.47      -4.65          3.33          83.1          0.63       6.55      4.95               1000.10
5     -0.25       3.01          1.38          66.1          1.16       6.18      6.28               1001.46
  TmaxFebIII TminFebIII TmeanFebIII RainfallFebIII HumidityFebIII SunshineFebIII CloudFebIII WindFebIII
1       0.05      -6.01       -3.35           4.60          83.50           1.29        6.58       4.71
2      -0.45      -7.43       -4.51           2.93          78.38           1.00        6.91       5.99
3       2.13      -4.51       -1.21           2.90          79.38           2.51        5.76       5.46
4       0.59      -3.79       -1.92           5.94          88.33           1.40        6.86       6.70
5      -2.68      -7.23       -5.05           1.39          83.88           1.13        7.41       5.69
  SeeLevelPressureFebIII TmaxJanI TminJanI TmeanJanI RainfallJanI HumidityJanI SunshineJanI CloudJanI WindJanI
1                 980.25     0.38    -5.57     -3.36         0.01         82.9         0.27      3.45     2.97
2                 997.71     4.29    -0.03      2.08         3.70         82.9         0.00      7.39     5.01
3                 988.45     1.02    -4.47     -1.87         2.22         82.3         0.00      6.94     4.29
4                 987.21     0.04    -6.28     -3.03         4.99         85.8         0.00      5.84     4.75
5                1023.84    -0.33    -5.11     -3.17         0.66         81.2         0.00      7.08     3.88
  SeeLevelPressureJanI TmaxJanII TminJanII TmeanJanII RainfallJanII HumidityJanII SunshineJanII CloudJanII
1              1023.71      0.09     -6.48      -2.50          4.29          86.5          0.01       7.23
2               984.57     -0.34     -6.49      -3.61          2.74          80.2          0.23       6.99
3              1004.06      0.32     -5.59      -3.03          5.28          83.3          0.00       6.68
4               983.42      8.38      1.46       4.97          0.64          69.3          0.10       6.13
5              1010.31      7.35      3.00       5.09          1.27          66.3          0.03       6.19
  WindJanII SeeLevelPressureJanII TmaxJanIII TminJanIII TmeanJanIII RainfallJanIII HumidityJanIII SunshineJanIII
1      5.42                998.88       5.66      -2.39        1.97           1.03          74.27           0.65
2      6.38               1011.44       3.84      -3.32       -0.37           0.70          73.55           0.55
3      6.24                980.15       4.33      -5.19       -0.59           2.23          76.64           0.69
4      6.44               1019.41       4.09      -2.67        0.05           2.18          71.73           0.42
5      6.74               1006.10       4.43      -0.86        1.58           1.91          80.09           0.20
  CloudJanIII WindJanIII SeeLevelPressureJanIII TmaxMarI TminMarI TmeanMarI RainfallMarI HumidityMarI
1        6.47       7.59                1004.59     2.83    -3.60     -0.72         2.14         79.9
2        5.25       4.72                1019.95    -5.31   -12.52     -9.52         2.28         72.6
3        5.34       4.65                1001.66    -0.70    -6.67     -4.47         1.39         81.0
4        5.85       4.83                1007.23     0.10    -7.91     -3.98         2.36         80.2
5        6.53       3.63                 992.53    -0.38    -4.59     -2.27         3.00         86.4
  SunshineMarI CloudMarI WindMarI SeeLevelPressureMarI TmaxMarII TminMarII TmeanMarII RainfallMarII HumidityMarII
1         0.85      6.77     6.64               986.96     -1.48     -8.43      -5.58          1.09          81.0
2         2.92      5.91     4.68              1013.17      6.53     -1.81       2.56          0.43          65.5
3         2.40      5.71     4.02              1014.62      0.53     -5.17      -2.90          5.20          82.8
4         0.91      7.02     5.87              1006.64      5.32     -0.94       1.23          1.11          74.4
5         0.19      7.82     4.49               999.35      1.60     -4.29      -1.89          0.95          79.3
  SunshineMarII CloudMarII WindMarII SeeLevelPressureMarII TmaxMarIII TminMarIII TmeanMarIII RainfallMarIII
1          2.12       5.51      3.93               1021.57       3.88      -1.95        0.55           1.42
2          2.25       6.29      6.11               1008.31       3.95      -2.46       -0.15           1.30
3          1.00       6.61      5.77               1006.63      -0.68      -6.60       -4.07           0.70
4          2.16       6.61      6.45               1003.23       5.49      -0.68        1.65           1.58
5          4.07       5.21      3.14               1017.24      -0.66      -7.21       -4.00           1.37
  HumidityMarIII SunshineMarIII CloudMarIII WindMarIII SeeLevelPressureMarIII
1          80.45           2.80        6.13       4.03                 995.31
2          72.09           3.98        5.99       5.14                1000.32
3          78.73           2.34        6.46       3.81                1005.67
4          74.64           2.85        6.54       6.34                1013.45
5          79.45           4.71        5.65       4.95                1010.47
 [ reached 'max' / getOption("max.print") -- omitted 5 rows ]

I want to run a normality test on all columns at once. I tried the pattern lapply(x, shapiro.test):

Betula_shapiro <- lapply(Betula, shapiro.test)

Error in FUN(X[[i]], ...) : is.numeric(x) is not TRUE

which did not work. I also tried:

Betula <- apply(Betula[which(sapply(Betula, is.numeric))], 2, shapiro.test)

Error in FUN(newX[, i], ...) : all 'x' values are identical


f <- function(x) { if (diff(range(x)) == 0) list() else shapiro.test(x) }

Betula <- apply(Betula[which(sapply(Betula, is.numeric))], 2, f)

Error in if (diff(range(x)) == 0) list() else shapiro.test(x) : missing value where TRUE/FALSE needed

So I did this:

Betula_numerics_only <- Betula[which(sapply(Betula, is.numeric))]

Select columns with at least 3 non-missing values and apply shapiro.test to them:

Betula_numerics_only_filled_columns <- Betula_numerics_only[which(apply(Betula_numerics_only, 2, function(f) sum(!is.na(f)) >= 3))]

Betula_shapiro <- apply(Betula_numerics_only_filled_columns, 2, shapiro.test)

Error in FUN(newX[, i], ...) : all 'x' values are identical

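Could you help me solve this problem?

1 answer:

Answer 0: (score: 0)

Since I talked about readability in the comments, I felt I should also provide something more readable as an answer.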

Let's create some dummy data:

data_test <- data.frame(matrix(rnorm(100, 10, 1), ncol = 5, byrow = TRUE), stringsAsFactors = FALSE)

Let's apply shapiro.test to each column:

apply(data_test, 2, shapiro.test)

If there are non-numeric columns:

Let's add a dummy character column for testing purposes:

data_test$non_numeric <- sample(c("hello", "hi", "good morning"), NROW(data_test), replace = TRUE)

Then try applying the test again:

apply(data_test, 2, shapiro.test)

which results in:

> apply(data_test, 2, shapiro.test)
Error: is.numeric(x) is not TRUE
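
The error arises because apply() first converts the data frame with as.matrix(), and a matrix can hold only one type, so a single character column turns every value into a string. A minimal sketch on a toy data frame (the names df, m and numeric_cols are made up for illustration):

```r
# Toy data: two numeric columns plus one character column.
set.seed(1)
df <- data.frame(a = rnorm(5), b = rnorm(5))
df$label <- c("x", "y", "x", "y", "x")

# apply() works on as.matrix(df); mixing types yields a character matrix,
# which is why shapiro.test() complains that x is not numeric.
m <- as.matrix(df)
typeof(m)

# Flag the numeric columns; vapply() guarantees a logical vector.
numeric_cols <- vapply(df, is.numeric, logical(1))
names(df)[numeric_cols]
```

Selecting the numeric columns first, as in the next step, avoids the coercion.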

To fix this, we use sapply to select only the numeric columns:

data_test[which(sapply(data_test, is.numeric))]

and combine that with apply:

apply(data_test[which(sapply(data_test, is.numeric))], 2, shapiro.test)

Removing the all-NA columns. First keep only the numeric columns:

data_test_numerics_only <- data_test[which(sapply(data_test, is.numeric))]

Then select the columns with at least 3 non-missing values and apply shapiro.test to them:

data_test_numerics_only_filled_colums <- data_test_numerics_only[which(apply(data_test_numerics_only, 2, function(f) sum(!is.na(f)) >= 3))]

apply(data_test_numerics_only_filled_colums, 2, shapiro.test)
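
The threshold of 3 comes from shapiro.test itself: its documentation allows missing values but requires between 3 and 5000 non-missing ones. As a rough illustration of the filter above, here is a sketch on a made-up data frame (df, n_obs, keep and testable are hypothetical names), using colSums as one possible way to count non-missing values per column:

```r
# Toy data: one complete column, one with two values, one entirely NA.
df <- data.frame(
  full   = c(1.2, 3.4, 2.2, 5.1, 4.0),
  sparse = c(NA, 2.0, NA, NA, 1.0),
  empty  = NA_real_
)

# Count the non-missing values in each column.
n_obs <- colSums(!is.na(df))

# Keep only columns that shapiro.test() can accept (at least 3 values).
keep <- n_obs >= 3
testable <- df[keep]
names(testable)
```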

We will get this running; let's try again :)

Remove the non-numeric columns:

Betula_numerics <- Betula[which(sapply(Betula, is.numeric))]

Remove the columns with fewer than 3 non-missing values:

Betula_numerics_filled <- Betula_numerics[which(apply(Betula_numerics, 2, function(f) sum(!is.na(f)) >= 3))]

Remove the columns with zero variance:

Betula_numerics_filled_not_constant <- Betula_numerics_filled[apply(Betula_numerics_filled, 2, function(f) var(f, na.rm = TRUE) != 0)]
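
The na.rm = TRUE in that filter matters. The "missing value where TRUE/FALSE needed" error from the question appears because range() returns NA as soon as a column contains an NA, so the if() condition is NA rather than TRUE or FALSE. A small sketch, where is_constant is a hypothetical helper and not part of base R:

```r
# A column that is constant apart from one missing value.
x <- c(4, NA, 4, 4)

# range() propagates the NA, so this difference is NA and
# `if (diff(range(x)) == 0)` would fail.
diff(range(x))

# Hypothetical helper: drop NAs first, then compare the extremes.
is_constant <- function(x) {
  x <- x[!is.na(x)]
  length(x) > 0 && diff(range(x)) == 0
}

is_constant(x)
is_constant(c(1, NA, 2))
```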

Run shapiro.test and hope for the best :)

apply(Betula_numerics_filled_not_constant, 2, shapiro.test)
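
If you prefer one table over a list of test objects, the steps can be folded into a single helper. This is only a sketch under my own assumptions: shapiro_by_column is a made-up name, and instead of pre-filtering it catches whatever error shapiro.test raises and records the message next to the column, so you can see why a column was skipped.

```r
# Hypothetical helper: run shapiro.test() on every numeric column and
# collect W, the p-value, and any error message in one data frame.
shapiro_by_column <- function(df) {
  num <- df[vapply(df, is.numeric, logical(1))]
  rows <- lapply(names(num), function(nm) {
    res <- tryCatch(shapiro.test(num[[nm]]), error = function(e) e)
    if (inherits(res, "error")) {
      # Column could not be tested (too few values, constant, ...).
      data.frame(column = nm, W = NA_real_, p_value = NA_real_,
                 note = conditionMessage(res), stringsAsFactors = FALSE)
    } else {
      data.frame(column = nm, W = unname(res$statistic),
                 p_value = res$p.value, note = "",
                 stringsAsFactors = FALSE)
    }
  })
  do.call(rbind, rows)
}

# Toy data covering the problem cases from the question.
set.seed(42)
toy <- data.frame(
  ok       = rnorm(10),
  constant = rep(0, 10),
  sparse   = c(1, 2, rep(NA, 8)),
  label    = letters[1:10],
  stringsAsFactors = FALSE
)
out <- shapiro_by_column(toy)
out
```

On your data this would be called as shapiro_by_column(Betula). Keep in mind that testing many columns at once means some small p-values are expected by chance alone.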