Question

The dataset has 8 columns and 49,048 rows. The raw data is uploaded to Dropbox: https://www.dropbox.com/s/u9r01rw8cgoepax/sample.xlsx?dl=0

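I used the following code to extract the rows where variable F has a value greater than 100, but the result contains many rows where F is missing: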

library(readxl)  # provides read_excel()
x = read_excel("file path")
x = x[x$F>100,]

> m = (x[x$F>100,])
> summary(m$F)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  100.1   137.0   244.6   375.0   443.5  2490.0   43570

Answer 1

That is indeed what happens. You can check that most of the F values are missing:

x <- readxl::read_excel("~/d/sample.xlsx")
summary(x$F)

gives you

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
   0.00    0.00    0.00    3.55    0.00 2490.00   43570

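If you subset with just x$F > 100, the comparison evaluates to NA wherever F is missing, and indexing rows with NA returns a row filled with NAs, so you get one NA row for every missing F. If you use which(x$F > 100) instead, you get only the numeric indices of the rows where the condition is TRUE, because which() drops NAs. So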

x[which(x$F > 100),]

gives you a subset of the data frame in which every F is greater than 100 (and none are missing).
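To see the difference without downloading the spreadsheet, here is a minimal sketch on a made-up toy data frame. The data frame `df`, its `id` column, and its values are assumptions for illustration only; only the column name F mirrors the question. It also shows two base R alternatives to which() that should behave the same way here.

```r
# Toy data frame standing in for the real spreadsheet (values are made up).
df <- data.frame(id = 1:6, F = c(50, 150, NA, 300, NA, 99))

# Logical indexing: NA > 100 is NA, and an NA row index yields an all-NA row.
by_logical <- df[df$F > 100, ]
nrow(by_logical)   # 4: two real matches plus two all-NA rows

# which() keeps only the positions where the condition is TRUE.
by_which <- df[which(df$F > 100), ]
nrow(by_which)     # 2

# Alternatives: mask out the NAs explicitly, or use subset(),
# which treats an NA condition as FALSE.
by_mask   <- df[!is.na(df$F) & df$F > 100, ]
by_subset <- subset(df, F > 100)
```

All three of by_which, by_mask and by_subset keep only the rows with id 2 and 4. Which form to use is mostly a matter of taste; which() is the smallest change to the original code.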