R - How to multiply only certain rows

Time: 2017-04-17 11:49:45

Tags: r dataframe multiplication

I have a simplified data frame:

x <- data.frame(
  condition = c("ctrl", "ctrl", "ctrl", "ctrl", "exp", "exp", "exp", "exp"),
  type = c(1, 2, 3, 4, 1, 2, 3, 4),
  value = c("x", "x", "x", "x", "x", "x", "x", "x")
)
#   condition type value
# 1      ctrl    1     x
# 2      ctrl    2     x
# 3      ctrl    3     x
# 4      ctrl    4     x
# 5       exp    1     x
# 6       exp    2     x
# 7       exp    3     x
# 8       exp    4     x

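I would like to create a new column that is the "value of type 1" multiplied by the "value of type 2". Does anyone have a suggestion for the best way to do this?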

1 answer:

Answer 0 (score: 1)

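I'm sure others will have a more elegant solution, but my suggestion is this: spread the data from long to wide format, multiply the required columns, and then gather it all back into long format.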

# Load packages
library(dplyr)
library(tidyr)

# Make dataframe
df <- tibble(condition = rep(c('ctrl', 'exp'), each = 4),
             type = as.character(rep(1:4, times = 2)),
             value = rnorm(8))

# Print df
df
#> # A tibble: 8 × 3
#>   condition  type       value
#>       <chr> <chr>       <dbl>
#> 1      ctrl     1  0.38735743
#> 2      ctrl     2  0.04950654
#> 3      ctrl     3  0.23559332
#> 4      ctrl     4 -0.02618723
#> 5       exp     1  0.77968387
#> 6       exp     2 -1.28652883
#> 7       exp     3  0.99731983
#> 8       exp     4 -0.28059754

# Process df 
df_2 <- df %>%
    # Retain types 1 and 2
    filter(type %in% c("1", "2")) %>% 
    # Spread the type column
    spread(key = type,
           value = value) %>%
    # Multiply values in type `1` and `2`
    mutate(`1 * 2` = `1` * `2`) %>%
    # Gather the types back together 
    # (omitting condition and `1 * 2` from the gather)
    gather(key = type,
           value = value,
           -c(`1 * 2`, condition)) %>%
    # Reorder columns
    select(condition, type, value, `1 * 2`) 

# Print df_2
df_2
#> # A tibble: 4 × 4
#>   condition  type       value     `1 * 2`
#>       <chr> <chr>       <dbl>       <dbl>
#> 1      ctrl     1  0.38735743  0.01917673
#> 2       exp     1  0.77968387 -1.00308578
#> 3      ctrl     2  0.04950654  0.01917673
#> 4       exp     2 -1.28652883 -1.00308578
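
The same spread/multiply/gather round trip can also be written with `pivot_wider()` and `pivot_longer()`, which newer tidyr releases recommend in place of `spread()` and `gather()`. The following is only a sketch, assuming tidyr 1.0.0 or later. The fixed `value` numbers and the names `df_wide` and `df_long` are made up for illustration so the products are easy to check by hand.

```r
library(dplyr)
library(tidyr)

# Illustrative data with fixed values (an assumption, not the random draws above)
df <- tibble(condition = rep(c("ctrl", "exp"), each = 4),
             type = as.character(rep(1:4, times = 2)),
             value = c(2, 3, 5, 7, 11, 13, 17, 19))

# Long -> wide: one column per retained type, then multiply the two columns
df_wide <- df %>%
    filter(type %in% c("1", "2")) %>%
    pivot_wider(names_from = type, values_from = value) %>%
    mutate(`1 * 2` = `1` * `2`)

# Wide -> long: fold the type columns back into type/value pairs
df_long <- df_wide %>%
    pivot_longer(cols = c(`1`, `2`),
                 names_to = "type",
                 values_to = "value") %>%
    select(condition, type, value, `1 * 2`)
```

With these made-up values the product is 2 * 3 = 6 for `ctrl` and 11 * 13 = 143 for `exp`, repeated on both the type 1 and type 2 rows of each condition.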

If you want to put it all back together with all the possible types, you can then join the two data frames.

# Join df_2 and df
df_3 <- df %>%
    left_join(df_2)
#> Joining, by = c("condition", "type", "value")

# Print df_3
df_3
#> # A tibble: 8 × 4
#>   condition  type       value     `1 * 2`
#>       <chr> <chr>       <dbl>       <dbl>
#> 1      ctrl     1  0.38735743  0.01917673
#> 2      ctrl     2  0.04950654  0.01917673
#> 3      ctrl     3  0.23559332          NA
#> 4      ctrl     4 -0.02618723          NA
#> 5       exp     1  0.77968387 -1.00308578
#> 6       exp     2 -1.28652883 -1.00308578
#> 7       exp     3  0.99731983          NA
#> 8       exp     4 -0.28059754          NA
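
As a possible alternative that skips the reshape and the join, the product can be computed within each condition using a grouped `mutate()`. This is a sketch rather than part of the original answer. It assumes each condition has exactly one row for type "1" and one for type "2". The column name `prod_12`, the name `df_alt`, and the fixed values are hypothetical choices for illustration.

```r
library(dplyr)

# Illustrative data with fixed values (an assumption, chosen for easy checking)
df <- tibble(condition = rep(c("ctrl", "exp"), each = 4),
             type = as.character(rep(1:4, times = 2)),
             value = c(2, 3, 5, 7, 11, 13, 17, 19))

df_alt <- df %>%
    group_by(condition) %>%
    # Within each condition, multiply the type 1 value by the type 2 value;
    # this relies on exactly one row per type per condition
    mutate(prod_12 = value[type == "1"] * value[type == "2"]) %>%
    ungroup() %>%
    # Keep the product only on the type 1 and type 2 rows, NA elsewhere
    mutate(prod_12 = if_else(type %in% c("1", "2"), prod_12, NA_real_))
```

This should give the same shape as `df_3`: eight rows, with the product filled in for types 1 and 2 and `NA` for types 3 and 4. Dropping the final `mutate()` would instead leave the product on every row of the condition.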