Question

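I am trying to use complete.cases to remove the NAs from a file.

I have been following answers on this site, but it is not working, and I am no longer sure whether what I am trying to do is even possible.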

juulDataRaw <- read.csv(url("http://blah"));
juulDataRaw[complete.cases(juulDataRaw),]

I also tried this (an example from this site):

dog<-structure(list(Sample = 1:6
,gene = c("ENSG00000208234","ENSG00000199674","ENSG00000221622","ENSG00000207604","ENSG00000207431","ENSG00000221312")
,hsap = c(0,0,0,0,0,0)
,mmul = c(NA,2,NA,NA,NA,1)
,mmus = c(NA,2,NA,NA,NA,2)
,rnor = c(NA,2,NA,1,NA,3)
,cfam = c(NA,2,NA,2,NA,2))
,.Names = c("Sample", "gene", "hsap", "mmul", "mmus", "rnor", "cfam"), class = "data.frame", row.names = c(NA, -6L))
dog[complete.cases(dog),]

and it works.

Can I do this with my data?
What is the difference between the two?
Aren't they both just data frames?

Answer 1

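You have quotes around the numeric values, so the columns are read in as factors. That makes "NA" just another string (a factor level) rather than a real R NA, so complete.cases never sees a missing value.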
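
The distinction described above can be checked directly. The following is a minimal sketch, not taken from the original data: realNA and textNA are hypothetical vectors invented for illustration, one holding a genuine missing value and one holding the text "NA" as a factor level.

```r
# Hypothetical vectors for illustration only.
realNA <- c(1.5, NA, 3.0)                  # numeric, with a genuine NA
textNA <- factor(c("1.5", "NA", "3.0"))    # factor whose levels include the text "NA"

print(class(realNA))
print(class(textNA))

# is.na() flags the genuine missing value...
print(is.na(realNA))

# ...but not the factor level that merely spells "NA".
print(is.na(textNA))
print(levels(textNA))
```

Running sapply(juulDataRaw, class) on the real data frame is a quick way to see whether its columns came in as factors (or character) instead of numeric.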

> juulDataRaw[] <- lapply(juulDataRaw, as.character)
> juulDataRaw[] <- lapply(juulDataRaw, as.numeric)
Warning messages:
1: In lapply(juulDataRaw, as.numeric) : NAs introduced by coercion
2: In lapply(juulDataRaw, as.numeric) : NAs introduced by coercion
3: In lapply(juulDataRaw, as.numeric) : NAs introduced by coercion
> juulDataRaw[complete.cases(juulDataRaw),]
       age height igf1 weight
55    6.00  111.6   98   19.1
57    6.08  116.7  242   21.7
61    6.26  120.3  196   24.7
66    6.40  115.5  179   19.6
69    6.42  115.6  126   20.6
71    6.43  116.1  142   20.2
80    6.61  130.3  236   28.0
81    6.63  122.2  148   21.6
83    6.70  126.2  174   26.1
84    6.72  125.6  136   22.6
85    6.72  121.0  164   24.4
snipped remaining output.....
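
For anyone who wants to reproduce the behaviour without the original file, here is a small self-contained sketch. The data frame juulDemo and its values are made up for illustration; it only assumes, as described above, that the columns are factors containing the text "NA".

```r
# Hypothetical stand-in for juulDataRaw: every column is a factor whose
# levels include the literal string "NA" (not R's missing value).
juulDemo <- data.frame(
  age    = factor(c("6.00", "6.08", "NA", "6.26")),
  height = factor(c("111.6", "NA", "120.1", "120.3"))
)

# complete.cases() finds nothing missing, so no rows would be dropped.
allComplete <- complete.cases(juulDemo)
print(allComplete)

# Convert factor -> character -> numeric. Going through as.character()
# matters: as.numeric() applied directly to a factor returns the internal
# level codes rather than the printed values.
juulFixed <- juulDemo
juulFixed[] <- lapply(juulFixed, function(col) as.numeric(as.character(col)))

# Now the text "NA" has become a real NA and the incomplete rows are removed.
cleaned <- juulFixed[complete.cases(juulFixed), ]
print(cleaned)
```

Assigning into juulFixed[] rather than juulFixed keeps the object a data frame, which is why the same idiom appears in the transcript above.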

R "complete.cases" works on one but not the other?
