I have a simplified data frame:
x <- data.frame(
condition = c("ctrl", "ctrl", "ctrl", "ctrl", "exp", "exp", "exp", "exp"),
type = c(1, 2, 3, 4, 1, 2, 3, 4),
value = c("x", "x", "x", "x", "x", "x", "x", "x")
)
# condition type value
# 1 ctrl 1 x
# 2 ctrl 2 x
# 3 ctrl 3 x
# 4 ctrl 4 x
# 5 exp 1 x
# 6 exp 2 x
# 7 exp 3 x
# 8 exp 4 x
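I want to create a new column that holds the value for type 1 multiplied by the value for type 2. Does anyone have a suggestion for the best way to do this?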
Answer 0 (score: 1)
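I'm sure others will have a more elegant solution, but my suggestion is to spread the data from long to wide format, multiply the columns you need, and then gather everything back into long format.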
# Load packages
library(dplyr)
library(tidyr)
# Make dataframe
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
df <- tibble(condition = rep(c('ctrl', 'exp'), each = 4),
             type = as.character(rep(1:4, times = 2)),
             value = rnorm(8))
# Print df
df
#> # A tibble: 8 × 3
#> condition type value
#> <chr> <chr> <dbl>
#> 1 ctrl 1 0.38735743
#> 2 ctrl 2 0.04950654
#> 3 ctrl 3 0.23559332
#> 4 ctrl 4 -0.02618723
#> 5 exp 1 0.77968387
#> 6 exp 2 -1.28652883
#> 7 exp 3 0.99731983
#> 8 exp 4 -0.28059754
# Process df
df_2 <- df %>%
  # Retain types 1 and 2 (type is a character column)
  filter(type %in% c("1", "2")) %>%
  # Spread the type column
  spread(key = type,
         value = value) %>%
  # Multiply values in type `1` and `2`
  mutate(`1 * 2` = `1` * `2`) %>%
  # Gather the types back together
  # (omitting condition and `1 * 2` from the gather)
  gather(key = type,
         value = value,
         -c(`1 * 2`, condition)) %>%
  # Reorder columns
  select(condition, type, value, `1 * 2`)
# Print df_2
df_2
#> # A tibble: 4 × 4
#> condition type value `1 * 2`
#> <chr> <chr> <dbl> <dbl>
#> 1 ctrl 1 0.38735743 0.01917673
#> 2 exp 1 0.77968387 -1.00308578
#> 3 ctrl 2 0.04950654 0.01917673
#> 4 exp 2 -1.28652883 -1.00308578
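For what it's worth, tidyr (1.0.0 and later) marks `spread()` and `gather()` as superseded in favour of `pivot_wider()` and `pivot_longer()`. Below is a rough sketch of the same steps with the newer functions. The name `df_2_alt` and the `set.seed()` call are my own additions for illustration, so the numbers will not match the output above, and the rows come back ordered by condition rather than by type.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
df <- tibble(condition = rep(c("ctrl", "exp"), each = 4),
             type = as.character(rep(1:4, times = 2)),
             value = rnorm(8))

df_2_alt <- df %>%
  # Retain types 1 and 2
  filter(type %in% c("1", "2")) %>%
  # Long to wide: one column per type
  pivot_wider(names_from = type, values_from = value) %>%
  # Multiply the two type columns
  mutate(`1 * 2` = `1` * `2`) %>%
  # Wide back to long, leaving condition and `1 * 2` alone
  pivot_longer(cols = c(`1`, `2`),
               names_to = "type",
               values_to = "value") %>%
  # Reorder columns
  select(condition, type, value, `1 * 2`)

df_2_alt
```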
If you then want to bring everything back together, with all of the possible types, you can join the two data frames.
# Join df_2 and df
df_3 <- df %>%
  left_join(df_2)
#> Joining, by = c("condition", "type", "value")
# Print df_3
df_3
#> # A tibble: 8 × 4
#> condition type value `1 * 2`
#> <chr> <chr> <dbl> <dbl>
#> 1 ctrl 1 0.38735743 0.01917673
#> 2 ctrl 2 0.04950654 0.01917673
#> 3 ctrl 3 0.23559332 NA
#> 4 ctrl 4 -0.02618723 NA
#> 5 exp 1 0.77968387 -1.00308578
#> 6 exp 2 -1.28652883 -1.00308578
#> 7 exp 3 0.99731983 NA
#> 8 exp 4 -0.28059754 NA
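If the reshaping feels like a lot of ceremony, a grouped `mutate()` may get to the same table in one step. This is only a sketch of a different technique, not the approach above, and it assumes every condition has exactly one row each for type 1 and type 2 (a missing or duplicated type would error or give the wrong length). The names `df_direct` and `prod_1_2` and the `set.seed()` call are made up for illustration.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
df <- tibble(condition = rep(c("ctrl", "exp"), each = 4),
             type = as.character(rep(1:4, times = 2)),
             value = rnorm(8))

df_direct <- df %>%
  # Work within each condition
  group_by(condition) %>%
  # Pick out the type 1 and type 2 values and multiply them;
  # rows of other types get NA, as in the joined result
  mutate(prod_1_2 = if_else(type %in% c("1", "2"),
                            value[type == "1"] * value[type == "2"],
                            NA_real_)) %>%
  ungroup()

df_direct
```

Dropping the `if_else()` wrapper and keeping only the product would fill the new column on every row of the condition, which may be closer to what the question asks for.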