How to select columns by a minimum value of the column in R with dplyr / tidyverse

Asked: 2018-10-24 16:13:55

Tags: r dplyr data-science

I have a dataset of land cover pixel counts for a number of points.

    species_distr <- data.frame(structure(list(Point = c(101, 102, 103, 104, 105, 106), `Herbaceous cover` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Tree or shrub cover` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Cropland, irrigated or post-flooding` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Mosaic cropland (>50%) / natural vegetation (tree, shrub, herbaceous cover) (<50%)` = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
), `Mosaic natural vegetation (tree, shrub, herbaceous cover) (>50%) / cropland (<50%)` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Tree cover, broadleaved, evergreen, closed to open (>15%)` = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
), `Tree cover, broadleaved, deciduous, closed to open (>15%)` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `Tree cover, broadleaved, deciduous, closed (>40%)` = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
), `Tree cover, broadleaved, deciduous, open (15-40%)` = c(NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
), `Tree cover, needleleaved, evergreen, closed to open (>15%)` = c(NA, 
NA, 1.73725490196078, NA, NA, NA), `Tree cover, needleleaved, evergreen, closed (>40%)` = c(NA, 
NA, 0L, NA, NA, NA), `Tree cover, needleleaved, evergreen, open (15-40%)` = c(NA, 
NA, 0L, NA, NA, NA), `Tree cover, needleleaved, deciduous, closed to open (>15%)` = c(2059.57647058824, 
544, 2209.63529411765, 1226.7568627451, 1722.34901960784, 1359.10196078432
), `Tree cover, needleleaved, deciduous, closed (>40%)` = c(NA, 
NA, 0L, 0L, NA, NA), `Tree cover, needleleaved, deciduous, open (15-40%)` = c(NA, 
NA, 0L, 0L, NA, NA), `Tree cover, mixed leaf type (broadleaved and needleleaved)` = c(NA, 
NA, 1.96470588235294, 0, NA, NA), `Mosaic tree and shrub (>50%) / herbaceous cover (<50%)` = c(NA, 
NA, 0, 2, NA, NA), `Mosaic herbaceous cover (>50%) / tree and shrub (<50%)` = c(NA, 
NA, 0L, NA, NA, NA), Shrubland = c(NA, NA, 0, NA, NA, NA), `Shrubland evergreen` = c(NA, 
NA, 0L, NA, NA, NA), `Shrubland deciduous` = c(NA, NA, 0, NA, 
NA, NA), Grassland = c(NA, NA, 0L, NA, NA, NA), `Lichens and mosses` = c(NA, 
NA, 0L, NA, NA, NA), `Sparse vegetation (tree, shrub, herbaceous cover) (<15%)` = c(NA, 
NA, 0, NA, NA, NA), `Sparse tree (<15%)` = c(NA, NA, 0L, NA, 
NA, NA), `Sparse shrub (<15%)` = c(NA, NA, 0L, NA, NA, NA), `Sparse herbaceous cover (<15%)` = c(NA, 
NA, 0L, NA, NA, NA), `Tree cover, flooded, fresh or brakish water` = c(NA, 
NA, 0, NA, NA, NA), `Tree cover, flooded, saline water` = c(NA, 
NA, 0L, NA, NA, NA), `Shrub or herbaceous cover, flooded, fresh/saline/brakish water` = c(NA, 
NA, 0, NA, NA, NA), `Urban areas` = c(NA, NA, 0L, NA, NA, NA), 
    `Bare areas` = c(NA, NA, 0, NA, NA, NA), `Consolidated bare areas` = c(NA, 
    NA, 0L, NA, NA, NA), `Unconsolidated bare areas` = c(NA, 
    NA, 0L, NA, NA, NA), `Water bodies` = c(NA, NA, 4.73725490196078, 
    NA, NA, NA)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame")))

I want to exclude all columns whose values never exceed, for example, 50. My quick-and-dirty solution is as follows:

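    # Collect the indices of columns whose maximum (ignoring NA) exceeds 50
    c <- NULL
    for (i in 2:length(species_distr)) {
      # Note: for an all-NA column, max() returns -Inf with a warning
      if (max(na.omit(species_distr[,i])) > 50) {
        c <- c(c, i)
      }
    }
    # Keep the Point column (1) plus the selected columns
    species_distr_plot <- species_distr[,c(1,c)]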

How can I achieve this with dplyr / tidyverse? So far I have tried:

  %>%
select_if(na.omit(max(.)) > 50)

1 answer:

Answer 0: (score: 1)

We may need any.
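The answer's original code did not survive, so the following is only a minimal sketch of how any could be combined with select_if, not necessarily what the answerer wrote. The data frame `toy` and its column names are hypothetical stand-ins for species_distr. The idea is that the predicate has to be a function (here a formula lambda) applied to each column, and that any(..., na.rm = TRUE) returns FALSE for an all-NA column instead of NA.

```r
library(dplyr)

# Hypothetical toy data standing in for species_distr
toy <- data.frame(
  Point  = c(101, 102, 103),
  all_na = c(NA_real_, NA_real_, NA_real_),
  small  = c(NA, 1.7, 0),
  large  = c(2059.6, 544, 2209.6)
)

# Keep only columns with at least one non-NA value above 50.
# Point stays because its own values (101, 102, 103) exceed 50.
kept <- toy %>%
  select_if(~ any(. > 50, na.rm = TRUE))

names(kept)
```

On species_distr the same call would presumably be species_distr %>% select_if(~ any(. > 50, na.rm = TRUE)). In dplyr 1.0.0 and later, select(where(~ any(.x > 50, na.rm = TRUE))) should be an equivalent spelling, since select_if has been superseded there.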