Question

For example, I have a data frame with many rows and columns:

treatment   gene1   gene2   gene3   …
A   0   3   0   …
A   0   0   0   …
A   0   0   0   …
A   1   1   0   …
A   0   0   0   …
B   0   1   1   …
B   0   5   2   …
B   0   0   3   …
B   0   0   0   …
…   …   …   …

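I would like to obtain a new data frame according to this rule: if all values of a gene within a treatment are 0, the value for that gene in that treatment is 0 (for example, gene1 in treatment B); otherwise it is 1 (for example, gene1 in treatment A). The new data frame would therefore be: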

treatment   gene1   gene2   gene3   …
A   1   1   0   …
B   0   1   1   …
…   …   …   …   …

Thank you very much for your help.

Answer 1

With dplyr, you can do the following:

library(dplyr)

df %>%
  group_by(treatment) %>%
  summarise_all(list(~ as.integer(any(.))))
#   treatment gene1 gene2 gene3
#   <fct>     <int> <int> <int>
# 1 A             1     1     0
# 2 B             0     1     1

The same result can be obtained with base R.
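One possible base R version is sketched below. It uses aggregate() with a formula and is an illustration only, not necessarily the code this answer originally showed. The data frame is rebuilt inline from the values listed under "Data" in Answer 2 so that the block runs on its own.

```r
# Example data, copied from the "Data" section of Answer 2
df <- data.frame(
  treatment = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  gene1 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L),
  gene2 = c(3L, 0L, 0L, 1L, 0L, 1L, 5L, 0L, 0L),
  gene3 = c(0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 0L)
)

# For every gene column, within each treatment:
# 1 if any value is non-zero, otherwise 0
res <- aggregate(. ~ treatment, data = df,
                 FUN = function(x) as.integer(any(x != 0)))
res
#   treatment gene1 gene2 gene3
# 1         A     1     1     0
# 2         B     0     1     1
```

Testing x != 0 explicitly, rather than passing the raw counts to any(), avoids relying on implicit coercion of numbers to logical values.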

Answer 2

An option with base R:

+(rowsum(df[-1], df$treatment) > 0)
#    gene1 gene2 gene3
#A     1     1     0
#B     0     1     1
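To show how the one-liner above works, here it is split into steps, as a sketch. The intermediate names sums, flags and result are introduced only for illustration. Note that "sum greater than 0" matches "any non-zero value" only when all values are non-negative, which is assumed here because the data look like counts.

```r
# Example data, copied from the "Data" section of this answer
df <- data.frame(
  treatment = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  gene1 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L),
  gene2 = c(3L, 0L, 0L, 1L, 0L, 1L, 5L, 0L, 0L),
  gene3 = c(0L, 0L, 0L, 0L, 0L, 1L, 2L, 3L, 0L)
)

# Step 1: sum every gene column within each treatment group
sums <- rowsum(df[-1], df$treatment)

# Step 2: TRUE where the group sum is positive, FALSE where it is zero
flags <- sums > 0

# Step 3: unary plus turns the logical values into 1/0
result <- +flags
result
```

If negative values can occur, comparing with sums != 0 is still not safe, because positive and negative values may cancel out; a per-group any(x != 0) check would be the more robust choice.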

Data

df <- structure(list(treatment = c("A", "A", "A", "A", "A", "B", "B", 
"B", "B"), gene1 = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L), gene2 = c(3L, 
0L, 0L, 1L, 0L, 1L, 5L, 0L, 0L), gene3 = c(0L, 0L, 0L, 0L, 0L, 
1L, 2L, 3L, 0L)), class = "data.frame", row.names = c(NA, -9L
))
