Question

Suppose you have the following situation:

    Statistic1       Condition1     Statistic2       Condition2         
      0.00001            Y             0.02              NA      
      0.03               Y             0.0001            NA         
      0.01               NA            0.001              Y       
     ..............

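There are 20,000 rows and 60 columns in total. Suppose you want to replace the NA / Y in the "Condition*" columns with 0 if the value in the corresponding Statistic* column is less than 0.05. The check involves the column pairs Statistic* - Condition*. How can this be done over a large number of columns and rows?

Thanks in advance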

B

Answer 1

One tidyverse possibility could be:

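library(tidyverse)

df %>%
 # Set every Condition column to 1
 mutate_at(vars(matches("Condition")), list(~ (. = 1))) %>%
 rowid_to_column() %>%
 # Wide to long, keeping the row ID
 gather(var, val, -rowid) %>%
 arrange(rowid) %>%
 # One group per row and Statistic/Condition pair
 group_by(rowid, pair = parse_number(var)) %>%
 # Condition stays 1 only if the preceding Statistic is < 0.05
 mutate(val = (lag(val, default = 0) < 0.05) * val) %>%
 ungroup() %>%
 select(-pair) %>%
 # Back to wide format
 spread(var, val) %>%
 select(-rowid)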

  Condition1 Condition2 Statistic1 Statistic2
       <dbl>      <dbl>      <dbl>      <dbl>
1          1          0    0.00001     1     
2          1          1    0.03        0.0001
3          1          1    0.01        0.001

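First, it assigns 1 to all "Condition" columns and creates a row ID. Second, it transforms the data from wide to long format, excluding the row ID. Third, it arranges the data by row ID and groups it by row ID and by pair, where the pair is given by the number in the column name. Fourth, it checks whether the statistic is below 0.05. Finally, it returns the data to its original format and removes the redundant variables.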
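
To make the fourth step concrete, here is a minimal base R sketch of what happens inside a single group. `lag0` is a hypothetical stand-in for dplyr's `lag(val, default = 0)`, and the two-element vectors are assumed to hold a Statistic value followed by its Condition value (already set to 1).

```r
# Hypothetical helper: shift a vector down by one, padding with 0,
# mimicking dplyr's lag(val, default = 0)
lag0 <- function(x) c(0, head(x, -1))

# One group = one row of one pair: Statistic first, then Condition (set to 1)
below <- c(0.03, 1)   # statistic below 0.05
above <- c(1, 1)      # statistic not below 0.05

res_below <- (lag0(below) < 0.05) * below
res_above <- (lag0(above) < 0.05) * above

res_below  # 0.03 1: the statistic is kept, Condition stays 1
res_above  # 1 0: the statistic is kept, Condition becomes 0
```

The statistic itself always survives because its lagged value is the default 0, which is below 0.05.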

I used this sample data, to which I added a case with a statistic equal to 1:

df <- read.table(text = "Statistic1       Condition1     Statistic2       Condition2         
0.00001            Y             1              NA      
0.03               Y             0.0001            NA         
0.01               NA            0.001              Y", 
                 header = TRUE,
                 stringsAsFactors = FALSE)
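
As a cross-check on this sample data, the same 0/1 Condition values can be sketched in base R without reshaping. This assumes every Statistic* column has a Condition* column with the same numeric suffix; `stat_cols` and `cond_cols` are names introduced only for this illustration.

```r
df <- read.table(text = "Statistic1 Condition1 Statistic2 Condition2
0.00001 Y  1      NA
0.03    Y  0.0001 NA
0.01    NA 0.001  Y",
                 header = TRUE,
                 stringsAsFactors = FALSE)

# Statistic columns and their matching Condition columns (assumed naming)
stat_cols <- grep("^Statistic", names(df), value = TRUE)
cond_cols <- sub("^Statistic", "Condition", stat_cols)

# Condition becomes 1 where the paired Statistic is below 0.05, else 0
df[cond_cols] <- lapply(df[stat_cols], function(x) as.numeric(x < 0.05))

df
```

Because the comparison is vectorised per column, this should scale to 60 columns without a wide-to-long round trip.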

Answer 2

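Create a boolean for each column, then combine them with a logical AND (&). Here is a simple example in which I check whether the numbers in both columns are greater than two.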

# Creating data
df <- data.frame(a = c(1,2,3,4), b = c(2,2,3,2))

# Running conditions on both columns and storing results in a new column
df$c <- df$a>2 & df$b>2

If you want to replace values in one column based on another column, you can do the following.

# Creating data
df <- data.frame(a = c(1,2,3,4), b = c(2,2,3,2))

# If column a is above 2 column b is set to zero
df$b[df$a>2] <- 0
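
The same indexing could be carried over to the paired columns from the question with a loop over the column names. This is only a sketch: it reads the question literally (Condition is set to 0 wherever the paired Statistic is below 0.05, and left alone otherwise), and the small data frame and the `stat_cols`, `stat` and `cond` names are assumptions made for illustration.

```r
# Small stand-in for the question's data (assumed layout)
df <- data.frame(
  Statistic1 = c(0.00001, 0.03, 0.01),
  Condition1 = c("Y", "Y", NA),
  Statistic2 = c(1, 0.0001, 0.001),
  Condition2 = c(NA, NA, "Y"),
  stringsAsFactors = FALSE
)

stat_cols <- grep("^Statistic", names(df), value = TRUE)

for (stat in stat_cols) {
  # Matching Condition column, e.g. Statistic1 -> Condition1
  cond <- sub("^Statistic", "Condition", stat)
  # The Condition columns are character, so 0 is stored as "0"
  df[[cond]][df[[stat]] < 0.05] <- 0
}

df
```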

In the future, please provide sample data and the expected output so that we can help.

Applying a condition on paired columns
