Change a value based on certain conditions in a data frame

Time: 2019-07-19 11:15:17

Tags: r dataframe

I have a data frame similar to this one:

session <- c(rep(34,8), rep(28,8))
trial_index <- c(rep(2,4),rep(5,4),rep(6,4),rep(8,4))
label <- c(rep(c("a","b","c","d"),4))
time <- c(10,2,7,40,4,3,6,20,5,3,5,15,4,2,3,17)
data <- data.frame(session, trial_index, label, time)

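What I want to do is change the value of "d" for each trial index and session. Each d should become d = d - c - b - a. For example, for session 34, trial index 2, d should be 40 - 7 - 2 - 10. I don't need to change the values of b and c. I don't know how to do this, so any help would be appreciated. Thanks!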

4 answers:

Answer 0: (score: 4)

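One approach is to rearrange the data so that each label becomes a separate column for every session-trial_index combination. Computing d then reduces to a simple subtraction of columns. After that, you can convert the data back to its original format.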

Below is a sample implementation of this approach:

library(tidyr) # To rearrange the data
library(dplyr) # To do the subtraction

data <- tidyr::spread(data, key = label, value = time) %>% # Makes labels as columns
  dplyr::mutate(d = d - c - b - a) %>%
  tidyr::gather(key = label, value = time,-session,-trial_index) # Convert back

The output of this code is:

| session| trial_index|label | time|
|-------:|-----------:|:-----|----:|
|      34|           2|a     |   10|
|      34|           2|b     |    2|
|      34|           2|c     |    7|
|      34|           2|d     |   21|
|      34|           5|a     |    4|
|      34|           5|b     |    3|
|      34|           5|c     |    6|
|      34|           5|d     |    7|
|      28|           6|a     |    5|
|      28|           6|b     |    3|
|      28|           6|c     |    5|
|      28|           6|d     |    2|
|      28|           8|a     |    4|
|      28|           8|b     |    2|
|      28|           8|c     |    3|
|      28|           8|d     |    8|
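The same reshape, subtract, reshape idea can also be sketched with `pivot_wider()` and `pivot_longer()`, which tidyr (1.0.0 and later) offers as the newer counterparts of `spread()` and `gather()`. This is only an illustration under the assumption that every session-trial_index group contains exactly one row for each of a, b, c, and d. The name `result` is made up for the example.

```r
library(tidyr)
library(dplyr)

# Rebuild the example data from the question
data <- data.frame(
  session     = c(rep(34, 8), rep(28, 8)),
  trial_index = c(rep(2, 4), rep(5, 4), rep(6, 4), rep(8, 4)),
  label       = rep(c("a", "b", "c", "d"), 4),
  time        = c(10, 2, 7, 40, 4, 3, 6, 20, 5, 3, 5, 15, 4, 2, 3, 17)
)

result <- data %>%
  # One column per label, one row per session/trial_index pair
  pivot_wider(names_from = label, values_from = time) %>%
  # Plain column arithmetic, as in the answer above
  mutate(d = d - c - b - a) %>%
  # Back to one row per label
  pivot_longer(cols = c(a, b, c, d), names_to = "label", values_to = "time")
```

If a group were missing one of the labels, the wide table would hold `NA` in that column and the new d would be `NA` as well, so it may be worth checking the data before relying on this.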

Answer 1: (score: 1)

Maybe something like this:

library(data.table)
setDT(data) # the [, j, by] and := syntax below requires a data.table
newdf <- data[, list(new=time[label=='d'] - time[label=='c'] - time[label=='b'] - time[label=='a']) ,list(session, trial_index)]
data <- merge(data,newdf)
data[label=='d',time := new]
data[,new := NULL]

Note that the merge reorders the data, so if you need to preserve the original order, just add an index first and then sort by it afterwards:

data[,index:=1:nrow(data)]
newdf <- data[, list(new=time[label=='d'] - time[label=='c'] - time[label=='b'] - time[label=='a']) ,list(session, trial_index)]
data <- merge(data,newdf)
data[label=='d',time := new]
data[,new := NULL]
data <- data[order(index),]
data[,index:=NULL]
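As a possible alternative to adding and removing an index, data.table's update join (`X[Y, on = ..., col := i.col]`) modifies the table by reference and therefore leaves the row order untouched. The sketch below assumes each group has exactly one "d" row; the names `dt`, `newdf`, and `new` are only illustrative.

```r
library(data.table)

# Rebuild the example data from the question as a data.table
dt <- data.table(
  session     = c(rep(34, 8), rep(28, 8)),
  trial_index = c(rep(2, 4), rep(5, 4), rep(6, 4), rep(8, 4)),
  label       = rep(c("a", "b", "c", "d"), 4),
  time        = c(10, 2, 7, 40, 4, 3, 6, 20, 5, 3, 5, 15, 4, 2, 3, 17)
)

# New d per group: d minus the sum of all other labels in that group
newdf <- dt[, .(new = time[label == "d"] - sum(time[label != "d"])),
            by = .(session, trial_index)]

# Update join: copy `new` into dt by reference, without reordering rows
dt[newdf, on = .(session, trial_index), new := i.new]

# Overwrite only the d rows, then drop the helper column
dt[label == "d", time := new]
dt[, new := NULL]

dt[label == "d", time]
# 21  7  2  8
```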

Answer 2: (score: 1)

Maybe a somewhat convoluted way, but here you go.

1) Shift the column down so that the a, b, and c values sit next to d.

library(dplyr)
data <- data %>% mutate(time2 = lag(time), time3 = lag(time2), time4 = lag(time3))

Thanks to David for suggesting doing the mutate in one line!

2) Do the calculation for the rows whose label equals d, and leave the rest unchanged.

data <- transform(data, time = ifelse(label == 'd', time-time2-time3-time4, time))

3) Remove the three helper columns created earlier:

data <- data[-c(5, 6, 7)]
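The lag-based steps work on the whole data frame at once, so they rely on every group having exactly four consecutive rows in a, b, c, d order. A grouped variant that does not depend on row position might look like the sketch below. It assumes dplyr is available and that each group has a single "d" row; `result` is a made-up name.

```r
library(dplyr)

# Rebuild the example data from the question
data <- data.frame(
  session     = c(rep(34, 8), rep(28, 8)),
  trial_index = c(rep(2, 4), rep(5, 4), rep(6, 4), rep(8, 4)),
  label       = rep(c("a", "b", "c", "d"), 4),
  time        = c(10, 2, 7, 40, 4, 3, 6, 20, 5, 3, 5, 15, 4, 2, 3, 17)
)

result <- data %>%
  group_by(session, trial_index) %>%
  # Within each group, subtract the sum of the non-d rows from the d row only
  mutate(time = ifelse(label == "d", time - sum(time[label != "d"]), time)) %>%
  ungroup()

result$time[result$label == "d"]
# 21  7  2  8
```

Because the subtraction uses `sum()` over the non-d rows, no helper columns are created and nothing has to be removed afterwards.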

Answer 3: (score: 1)

A solution using data.table:

library(data.table)

## Just subtract everything from "d" (as the order doesn't really matter) by group
## Note: rev(time) puts "d" first, so this relies on "d" being the last row of each group
d <- setDT(data)[, Reduce(`-`, rev(time)), by = .(session, trial_index)]$V1

## Insert the results only for "d" 
data[label == "d", time := d]

data
#     session trial_index label time
#  1:      34           2     a   10
#  2:      34           2     b    2
#  3:      34           2     c    7
#  4:      34           2     d   21
#  5:      34           5     a    4
#  6:      34           5     b    3
#  7:      34           5     c    6
#  8:      34           5     d    7
#  9:      28           6     a    5
# 10:      28           6     b    3
# 11:      28           6     c    5
# 12:      28           6     d    2
# 13:      28           8     a    4
# 14:      28           8     b    2
# 15:      28           8     c    3
# 16:      28           8     d    8
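For readers who would rather avoid extra packages, roughly the same result can be sketched in base R with `ave()`, which applies a function per group and returns a vector aligned with the original rows. This is only an illustration, assuming one "d" row per group; `is_d` and `others` are hypothetical helper names.

```r
# Rebuild the example data from the question
data <- data.frame(
  session     = c(rep(34, 8), rep(28, 8)),
  trial_index = c(rep(2, 4), rep(5, 4), rep(6, 4), rep(8, 4)),
  label       = rep(c("a", "b", "c", "d"), 4),
  time        = c(10, 2, 7, 40, 4, 3, 6, 20, 5, 3, 5, 15, 4, 2, 3, 17)
)

is_d <- data$label == "d"

# Per-group sum of the non-d times, repeated on every row of the group
others <- ave(ifelse(is_d, 0, data$time),
              data$session, data$trial_index, FUN = sum)

# Subtract that sum from the d rows only
data$time[is_d] <- data$time[is_d] - others[is_d]

data$time[is_d]
# 21  7  2  8
```

Unlike the `Reduce()` approach, this does not depend on "d" being the last row of its group.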