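I'm still fairly new to R and have been struggling with what is probably a very simple problem.
My data has multiple columns that are named in a similar way. Here is some sample data: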
df = data.frame(PPID = 1:50,
time1 = sample(c(0,1), 50, replace = TRUE),
time2 = sample(c(0,1), 50, replace = TRUE),
time3 = sample(c(0,1), 50, replace = TRUE),
condition1 = sample(c(0:3), 50, replace = TRUE),
condition2 = sample(c(0:3), 50, replace = TRUE))
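In my actual data I have many more columns: about 50 time columns and about 10 condition columns.
I would like to multiply each time column by each condition column. For the sample data this should give me 6 additional columns: time1_condition1, time1_condition2, time2_condition1, time2_condition2, time3_condition1, time3_condition2.
I tried the solutions suggested in this thread, but they did not work (presumably because I don't understand how mapply / apply work and did not adapt them properly). I got an error message saying that the longer argument is not a multiple of the length of the shorter one.
Any help would be greatly appreciated!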
Answer 0: (score: 2)
# Indices of all columns whose names start with "time"
time_cols <- grep("^time", names(df))
# Indices of all columns whose names start with "condition"
condition_cols <- grep("^condition", names(df))
# Multiply each "time" column by all of the "condition" columns
# and bind the results into a new data frame
new_df <- do.call("cbind", lapply(df[time_cols], function(x) x * df[condition_cols]))
# Combine the original and the new data frames
complete_df <- cbind(df, new_df)
We can also use expand.grid to build names for the new columns:
new_names <- do.call("paste0",
expand.grid(names(df)[condition_cols], names(df)[time_cols]))
colnames(complete_df)[7:12] <- new_names
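Note that paste0 as used above glues the two parts together without a separator and with the condition first (condition1time1, condition2time1, ...), and that 7:12 is hard-coded for the sample data. A minimal sketch of one way to get the time1_condition1 style asked for in the question is below. The seed, the name `grid`, and the final sanity check are assumptions added for illustration and are not part of the original answer.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
df <- data.frame(PPID = 1:50,
                 time1 = sample(c(0, 1), 50, replace = TRUE),
                 time2 = sample(c(0, 1), 50, replace = TRUE),
                 time3 = sample(c(0, 1), 50, replace = TRUE),
                 condition1 = sample(c(0:3), 50, replace = TRUE),
                 condition2 = sample(c(0:3), 50, replace = TRUE))

time_cols <- grep("^time", names(df))
condition_cols <- grep("^condition", names(df))
new_df <- do.call("cbind", lapply(df[time_cols], function(x) x * df[condition_cols]))

# expand.grid varies its first argument fastest, which matches the column
# order of new_df: the condition changes within each time column
grid <- expand.grid(condition = names(df)[condition_cols],
                    time = names(df)[time_cols],
                    stringsAsFactors = FALSE)

# Name every product column as <time>_<condition>, without hard-coded positions
names(new_df) <- paste(grid$time, grid$condition, sep = "_")
complete_df <- cbind(df, new_df)

# Sanity check: one product column against a direct multiplication
stopifnot(identical(complete_df$time2_condition1, df$time2 * df$condition1))
```

Renaming new_df before the cbind avoids depending on how many columns df happens to have.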
Answer 1: (score: 2)
Here is an alternative using the tidyverse:
library(tidyverse)

idx.time <- grep("time", names(df), value = TRUE)
idx.cond <- grep("condition", names(df), value = TRUE)

bind_cols(
  df,
  map_dfc(
    transpose(expand.grid(idx.time, idx.cond, stringsAsFactors = FALSE)),
    ~ setNames(data.frame(df[, .x$Var1] * df[, .x$Var2]),
               paste(.x$Var1, .x$Var2, sep = "_"))
  )
)
#  PPID time1 time2 time3 condition1 condition2 time1_condition1
#1    1     1     0     1          3          0                3
#2    2     0     1     1          0          1                0
#3    3     0     1     1          0          2                0
#4    4     0     0     1          0          3                0
#5    5     0     0     0          0          3                0
#...
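Explanation: expand.grid creates all pairwise combinations of idx.time and idx.cond. transpose turns a list / data.frame inside out and returns a list, similar to apply(..., 1, as.list); map_dfc then operates on each element of that list and binds the results column-wise.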
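To see what is being iterated over, it can help to look at the pairs themselves. The sketch below uses base R only, so it is a stand-in for the purrr calls rather than the code from this answer. The names `pairs`, `pair_list` and `pair_names` are made up for illustration.

```r
idx.time <- c("time1", "time2", "time3")
idx.cond <- c("condition1", "condition2")

# All pairwise combinations; Var1 (the first argument) varies fastest
pairs <- expand.grid(idx.time, idx.cond, stringsAsFactors = FALSE)

# One list per row, each holding a single Var1/Var2 pair
# (a base R stand-in for what transpose() produces here)
pair_list <- lapply(seq_len(nrow(pairs)), function(i) as.list(pairs[i, ]))

# The name each product column would receive
pair_names <- vapply(pair_list,
                     function(p) paste(p$Var1, p$Var2, sep = "_"),
                     character(1))
print(pair_names)
```

Each element of pair_list plays the role of .x in the formula passed to map_dfc, which is why the formula can refer to .x$Var1 and .x$Var2.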
Answer 2: (score: 1)
Using
library(tidyverse)
a = df[grep("time",names(df))]
b = df[grep("condition",names(df))]
we can do:
map(a,~.x*b)%>%
bind_cols()%>%
set_names(paste(rep(names(a),each=ncol(b)),names(b),sep="_"))
Or we can do:
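cross2(a, b) %>%
  map(lift(`*`)) %>%
  # cross2() varies its first argument fastest, so the names must cycle
  # through names(a) within each element of names(b)
  set_names(paste(names(a), rep(names(b), each = ncol(a)), sep = "_")) %>%
  data.frame()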
  time1_condition1 time2_condition1 time3_condition1 time1_condition2 time2_condition2 time3_condition2
1                3                0                3                2                0                2
2                3                3                0                1                1                0
3                0                0                0                0                0                0
4                3                3                0                0                0                0
5                0                0                2                0                0                1
6                0                0                1                0                0                1
7                2                2                0                0                0                0
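As far as I know, cross2() and lift() are deprecated as of purrr 1.0.0, so a dependency-free variant may be useful. The following is a rough base R sketch of the same idea, not the answerer's code: expand.grid supplies the name pairs and Map does the multiplication. The seed and the names `pairs`, `products` and `result` are assumptions for illustration.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
df <- data.frame(PPID = 1:50,
                 time1 = sample(c(0, 1), 50, replace = TRUE),
                 time2 = sample(c(0, 1), 50, replace = TRUE),
                 time3 = sample(c(0, 1), 50, replace = TRUE),
                 condition1 = sample(c(0:3), 50, replace = TRUE),
                 condition2 = sample(c(0:3), 50, replace = TRUE))

a <- df[grep("^time", names(df))]
b <- df[grep("^condition", names(df))]

# All (time, condition) name pairs; the time name varies fastest
pairs <- expand.grid(time = names(a), condition = names(b),
                     stringsAsFactors = FALSE)

# Multiply the two columns named by each pair
products <- Map(function(t, cond) a[[t]] * b[[cond]],
                pairs$time, pairs$condition)

# Build the names from the same pairs so labels and values cannot drift apart
names(products) <- paste(pairs$time, pairs$condition, sep = "_")
result <- as.data.frame(products)

head(result)
```

Because the names come from the same `pairs` table that drives the multiplication, the column order here is time1_condition1, time2_condition1, time3_condition1, time1_condition2, and so on.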