I have a data frame similar to this one:
session <- c(rep(34,8), rep(28,8))
trial_index <- c(rep(2,4),rep(5,4),rep(6,4),rep(8,4))
label <- c(rep(c("a","b","c","d"),4))
time <- c(10,2,7,40,4,3,6,20,5,3,5,15,4,2,3,17)
data <-data.frame(session, trial_index,label,time)
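What I want to do is change the value of "d" for each trial index and session. Each d should become d = d - c - b - a. For example, for session 34, trial index 2, d should be 40 - 7 - 2 - 10 = 21. I do not need to change the values of b and c. I don't know how to do this, so any help would be appreciated. Thanks!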
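Answer 0 (score: 4)
One approach is to rearrange the data so that each label becomes a separate column for every session-trial_index combination. The calculation of d is then a simple column subtraction. After that, you can convert the data back to its original long format.
Below is a sample implementation of this: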
library(tidyr) # To rearrange the data
library(dplyr) # To do the subtraction
data <- tidyr::spread(data, key = label, value = time) %>% # Makes labels as columns
dplyr::mutate(d = d - c - b - a) %>%
tidyr::gather(key = label, value = time,-session,-trial_index) # Convert back
The output of this code is:
| session| trial_index|label | time|
|-------:|-----------:|:-----|----:|
| 34| 2|a | 10|
| 34| 2|b | 2|
| 34| 2|c | 7|
| 34| 2|d | 21|
| 34| 5|a | 4|
| 34| 5|b | 3|
| 34| 5|c | 6|
| 34| 5|d | 7|
| 28| 6|a | 5|
| 28| 6|b | 3|
| 28| 6|c | 5|
| 28| 6|d | 2|
| 28| 8|a | 4|
| 28| 8|b | 2|
| 28| 8|c | 3|
| 28| 8|d | 8|
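In recent versions of tidyr (1.0.0 and later), `spread()` and `gather()` are superseded by `pivot_wider()` and `pivot_longer()`. The same reshape-subtract-reshape idea could be sketched with those functions as follows; the name `result` is only illustrative, and the sketch assumes every session/trial_index group has exactly one row for each of a, b, c, and d.

```r
library(tidyr)  # pivot_wider(), pivot_longer()
library(dplyr)  # %>%, mutate()

# Sample data from the question
data <- data.frame(
  session     = c(rep(34, 8), rep(28, 8)),
  trial_index = c(rep(2, 4), rep(5, 4), rep(6, 4), rep(8, 4)),
  label       = rep(c("a", "b", "c", "d"), 4),
  time        = c(10, 2, 7, 40, 4, 3, 6, 20, 5, 3, 5, 15, 4, 2, 3, 17)
)

result <- data %>%
  # One column per label, one row per session/trial_index group
  pivot_wider(names_from = label, values_from = time) %>%
  # Plain column arithmetic within each row
  mutate(d = d - c - b - a) %>%
  # Back to the long format with label/time columns
  pivot_longer(cols = c(a, b, c, d), names_to = "label", values_to = "time")
```

If a group were missing one of the labels, the wide table would contain `NA` in that column and the new d for that group would be `NA` as well.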
Answer 1 (score: 1)
Maybe something like this:
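library(data.table)
setDT(data) # the syntax below needs a data.table, so convert the data.frame by reference
newdf <- data[, list(new=time[label=='d'] - time[label=='c'] - time[label=='b'] - time[label=='a']) ,list(session, trial_index)]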
data <- merge(data,newdf)
data[label=='d',time := new]
data[,new := NULL]
Note that the merge reorders the data, so if you need to preserve the original row order, just add an index first and re-sort afterwards:
data[,index:=1:nrow(data)]
newdf <- data[, list(new=time[label=='d'] - time[label=='c'] - time[label=='b'] - time[label=='a']) ,list(session, trial_index)]
data <- merge(data,newdf)
data[label=='d',time := new]
data[,new := NULL]
data <- data[order(index),]
data[,index:=NULL]
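If the reordering caused by the merge is a concern, one way to sidestep it is to compute the per-group value in place. The following is a minimal base R sketch, not part of the answer above; it assumes one d row per session/trial_index group, and the helper names `is_d` and `others` are made up for illustration.

```r
# Sample data from the question
data <- data.frame(
  session     = c(rep(34, 8), rep(28, 8)),
  trial_index = c(rep(2, 4), rep(5, 4), rep(6, 4), rep(8, 4)),
  label       = rep(c("a", "b", "c", "d"), 4),
  time        = c(10, 2, 7, 40, 4, 3, 6, 20, 5, 3, 5, 15, 4, 2, 3, 17)
)

is_d <- data$label == "d"

# Sum of the non-d times within each group, repeated on every row of that group.
# The d rows contribute 0, so only a + b + c is accumulated.
others <- ave(ifelse(is_d, 0, data$time), data$session, data$trial_index, FUN = sum)

# Overwrite only the d rows; row order is untouched because nothing is merged.
data$time[is_d] <- data$time[is_d] - others[is_d]
```

Because `ave()` returns a vector aligned with the original rows, no index column or re-sorting step is needed.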
Answer 2 (score: 1)
Perhaps a somewhat convoluted way, but here you go.
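1) Shift the time column down so that the a, b, and c values sit in the same row as d.
library(dplyr) # provides %>%, mutate() and lag()
data <- data %>% mutate(time2 = lag(time), time3 = lag(time2), time4 = lag(time3))
Thanks to David for suggesting doing the mutate in one line!
2) Do the calculation for rows whose label equals d, and leave the rest unchanged.
data <- transform(data, time = ifelse(label == 'd', time-time2-time3-time4, time))
3) Remove the three helper columns created earlier, which are no longer needed:
data <- data[-c(5, 6, 7)]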
Answer 3 (score: 1)
A solution using data.table:
library(data.table)
## Just subtract everything from "d" (as the order of a, b, c doesn't really matter) by group
## rev(time) puts d first, which assumes d is the last row of each group
d <- setDT(data)[, Reduce(`-`, rev(time)), by = .(session, trial_index)]$V1
## Insert the results only for "d"
data[label == "d", time := d]
data
# session trial_index label time
# 1: 34 2 a 10
# 2: 34 2 b 2
# 3: 34 2 c 7
# 4: 34 2 d 21
# 5: 34 5 a 4
# 6: 34 5 b 3
# 7: 34 5 c 6
# 8: 34 5 d 7
# 9: 28 6 a 5
# 10: 28 6 b 3
# 11: 28 6 c 5
# 12: 28 6 d 2
# 13: 28 8 a 4
# 14: 28 8 b 2
# 15: 28 8 c 3
# 16: 28 8 d 8
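The `Reduce()` approach relies on d being the last row within each group. If that ordering cannot be guaranteed, a variant that selects rows by label might look like the sketch below. This is an assumption-laden illustration rather than the answer's method: `dt` is a hypothetical name, and each group is assumed to contain exactly one d row.

```r
library(data.table)

# Sample data from the question, built directly as a data.table
dt <- data.table(
  session     = c(rep(34, 8), rep(28, 8)),
  trial_index = c(rep(2, 4), rep(5, 4), rep(6, 4), rep(8, 4)),
  label       = rep(c("a", "b", "c", "d"), 4),
  time        = c(10, 2, 7, 40, 4, 3, 6, 20, 5, 3, 5, 15, 4, 2, 3, 17)
)

# Within each group, replace the d entry by d minus the sum of all other entries.
# Rows are picked by label, so their order inside the group does not matter.
dt[, time := replace(time, label == "d",
                     time[label == "d"] - sum(time[label != "d"])),
   by = .(session, trial_index)]
```

The update happens by reference through `:=`, so `dt` itself is modified and no reassignment is required.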