R: Remove NA from a factor vector and convert it to numeric

Time: 2014-07-02 09:20:39

Tags: r vector

I have a vector like this:

 [1] "72.82947"  NA          NA          NA          NA          NA          "66.00949"  NA         
  [9] NA          "0.133434"  NA          NA          "2.265083"  NA          NA          NA         
 [17] " 0"        NA          NA          NA          NA          NA          NA          NA         
 [25] "0.311346"  NA          NA          " 0"        NA          NA          NA          NA         
 [33] NA          NA          NA          NA          NA          "0.7024582" NA          NA         
 [41] NA          NA          NA          NA          NA          NA          "3.543211"  NA         
 [49] NA          "5.779669"  NA          "4.617021"  NA          "1.682751"  NA          NA         
 [57] NA          NA          NA          "0.214977"  NA          NA          NA          "1.573152" 

Following many previous questions ("How to remove all the NA from a Vector?", "R script - removing NA values from a vector", "R: removing NAs in numerical vectors") and the manual, I tried:

vector.test[!is.na(exo.1.4.mad)]

vector.test[na.omit(exo.1.4.mad)]

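But neither of them works. I always get back the same vector, NAs included. I then tried subsetting the vector manually, giving the positions where I have values, and converting the result to numeric: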

as.numeric(as.character(exo.1.4.mad.values))

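But that does not work either: NAs are introduced by coercion. At this point, I think I am missing something about the format/class of the original vector.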

Any suggestions?


I have added more information about my object:

  

typeof(exo.1.4.mad)
[1] "integer"

dput(exo.1.4.mad)
structure(c(33L, 37L, 37L, 37L, 37L, 37L, 31L, 37L, 37L, 4L,
37L, 37L, 20L, 37L, 37L, 37L, 1L, 37L, 37L, 37L, 37L, 37L, 37L,
37L, 8L, 37L, 37L, 1L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L,
37L, 11L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 37L, 24L, 37L, 37L,
29L, 37L, 26L, 37L, 19L, 37L, 37L, 37L, 37L, 37L, 6L, 37L, 37L,
37L, 18L, 37L, 2L, 37L, 1L, 37L, 14L, 37L, 25L, 37L, 27L, 37L,
10L, 37L, 3L, 37L, 37L, 35L, 37L, 37L, 28L, 37L, 37L, 37L, 32L,
37L, 12L, 37L, 30L, 37L, 37L, 37L, 37L, 37L, 36L, 37L, 37L, 7L,
37L, 13L, 37L, 37L, 37L, 37L, 9L, 37L, 37L, 37L, 21L, 37L, 37L,
37L, 37L, 37L, 37L, 15L, 37L, 37L, 37L, 34L, 37L, 23L, 37L, 37L,
37L, 37L, 37L, 22L, 37L, 37L, 37L, 16L, 37L, 37L, 17L, 37L, 5L,
37L), .Label = c("\" 0\"", "\"0.044478\"", "\"0.1103672\"", "\"0.133434\"",
"\"0.1893487\"", "\"0.214977\"", "\"0.2506812\"", "\"0.311346\"",
"\"0.3219932\"", "\"0.409485\"", "\"0.7024582\"", "\"0.7029872\"",
"\"0.7983231\"", "\"1.104537\"", "\"1.170474\"", "\"1.2355\"",
"\"1.255681\"", "\"1.573152\"", "\"1.682751\"", "\"2.265083\"",
"\"2.491765\"", "\"2.566038\"", "\"2.731105\"", "\"3.543211\"",
"\"4.42271\"", "\"4.617021\"", "\"5.235322\"", "\"5.340412\"",
"\"5.779669\"", "\"5.847934\"", "\"66.00949\"", "\"67.9525\"",
"\"72.82947\"", "\"75.2123\"", "\"8.347973\"", "\"9.832462\"",
"NA"), class = "factor")

This confuses me even more!

3 answers:

Answer 0: (score: 2)

Try:

exo1 <- as.numeric(gsub("[^.0-9]+","",exo.1.4.mad))
exo1[!is.na(exo1)]
 #[1] 72.8294700 66.0094900  0.1334340  2.2650830  0.0000000  0.3113460
 #[7]  0.0000000  0.7024582  3.5432110  5.7796690  4.6170210  1.6827510
 #[13]  0.2149770  1.5731520  0.0444780  0.0000000  1.1045370  4.4227100
 #[19]  5.2353220  0.4094850  0.1103672  8.3479730  5.3404120 67.9525000
 #[25]  0.7029872  5.8479340  9.8324620  0.2506812  0.7983231  0.3219932
 #[31]  2.4917650  1.1704740 75.2123000  2.7311050  2.5660380  1.2355000
 #[37]  1.2556810  0.1893487

Explanation

 [^.0-9]+ ## matches any run of characters other than digits and the dot; gsub() replaces each run with an empty string.
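To make the effect of the pattern concrete, here is a minimal sketch on a small made-up factor. The sample values and the names `x`, `cleaned` and `nums` are assumptions for illustration, chosen to mimic the labels in the question (the quote characters are part of the label text).

```r
# Hypothetical sample mimicking the question's factor labels;
# the quote characters are part of each label
x <- factor(c("\"72.82947\"", "NA", "\" 0\"", "\"0.133434\"", "NA"))

# gsub() works on the labels (character form) of the factor, not its integer codes.
# Everything that is not a digit or a dot is dropped, so "NA" becomes "".
cleaned <- gsub("[^.0-9]+", "", x)
cleaned

# Empty strings turn into real NA values, which is.na() can then detect
nums <- as.numeric(cleaned)
nums[!is.na(nums)]
```

One caveat with this pattern: it also strips minus signs and exponent markers, so a label such as "1e-5" would become "15". That does not matter for the data shown in the question, but it may for other inputs.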

Answer 1: (score: 1)

This works for me:

> myVec <- c(NA, "1", "2", NA)
> myVec
[1] NA  "1" "2" NA 
> as.numeric(myVec[!is.na(myVec)])
[1] 1 2

Does this help you?

Answer 2: (score: 1)

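The problem with your data is that your "NA"s are not real NAs as R defines them; they are just character strings. That is why is.na does not work here. Simply do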

exo.1.4.mad[exo.1.4.mad != "NA"]
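The subset above still returns a factor, so a further step is needed to get numbers. The following is a rough sketch on a small made-up factor, assuming the labels carry literal quote characters and stray spaces as in the dput output in the question; `f`, `kept` and `vals` are hypothetical names.

```r
# Hypothetical sample: a factor whose "missing" entries are the text label "NA"
f <- factor(c("\"72.82947\"", "NA", "\" 0\"", "NA", "\"0.133434\""))

# All FALSE: "NA" is an ordinary level here, not a missing value
is.na(f)

# Drop the entries labelled "NA"; the result is still a factor
kept <- f[f != "NA"]

# as.numeric() on a factor returns its integer codes, so convert via
# as.character() and strip the embedded quotes and spaces first
vals <- as.numeric(gsub("[\" ]", "", as.character(kept)))
vals
```

Going through as.character() matters: calling as.numeric() directly on `kept` would return the level codes rather than the values.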