Question

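I want to select half of a data frame for each given value in one of its columns. In other words, from the data frame given below, I need to extract half of the rows for each value in column Y: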

DF:
 id1  column Y   value
9830     A         6 
7609     A         0 
9925     B         0 
9922     B         5 
9916     B         6
9917     B         8 
9914     C         2
9914     C         7
9914     C         7
9914     C         2
9914     C         9

The new data frame should look like this:

  NEW DF:
     id1  column Y   value
    9830     A         6 
    9925     B         0 
    9922     B         5 
    9914     C         2
    9914     C         7

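It would also be helpful to know how to select a random half of the rows for each value of column Y in data frame DF (that is, rather than selecting the first 50%).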

Any help is appreciated. Thanks!

Answer 1

Assuming you want the first half of each group of rows that share the same column Y value, rounding down when a group has an odd number of rows, we can use filter from dplyr:

    library(dplyr)
    df %>%
      group_by(`column Y`) %>%
      filter(row_number() <= floor(n()/2))
    ##Source: local data frame [5 x 3]
    ##Groups: column Y [3]
    ##
    ##    id1 column Y laclen
    ##  <int>   <fctr>  <int>
    ##1  9830        A      6
    ##2  9925        B      0
    ##3  9922        B      5
    ##4  9914        C      2
    ##5  9914        C      7

We first group_by column Y (note the backquotes, which are needed because the column name contains a space) and then use filter to keep only the rows whose row_number is less than or equal to the total number of rows in the group, given by n(), divided by 2 and rounded down with floor.

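To select a random 50% of the rows in each group instead, use sample to generate the row numbers to keep and %in% to match each row's number against them.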
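The random selection described above can be sketched roughly as follows. This is a reconstruction under assumptions, not necessarily the answerer's exact code: the data frame is rebuilt from the question's table, the name random_half is made up for illustration, and set.seed is there only so the draw is repeatable.

```r
library(dplyr)

# Data rebuilt from the question's table (assumed; third column named `value`).
# check.names = FALSE keeps the space in the `column Y` name.
df <- data.frame(
  id1 = c(9830, 7609, 9925, 9922, 9916, 9917, 9914, 9914, 9914, 9914, 9914),
  `column Y` = c("A", "A", "B", "B", "B", "B", "C", "C", "C", "C", "C"),
  value = c(6, 0, 0, 5, 6, 8, 2, 7, 7, 2, 9),
  check.names = FALSE
)

set.seed(42)  # assumption: fixed seed, used only to make the sample reproducible

# random_half is a hypothetical name for the result.
random_half <- df %>%
  group_by(`column Y`) %>%
  # Within each group, draw floor(n()/2) row positions at random
  # and keep the rows whose position is among them.
  filter(row_number() %in% sample(seq_len(n()), floor(n() / 2))) %>%
  ungroup()
```

Which rows are kept changes with the seed, but each group should always contribute half of its rows rounded down: one row for A, two for B, and two for C.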

Selecting half of a data frame for each given value in a column
