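This has probably been asked many times, but the cases I found are more complex than mine, and I really don't know where to start. I need to add a new column (Condition) to my data frame and fill it based on the values in the cellNr column.
My data frame is molten.pC: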
  cellNr     value
1    G63  0.000000
2    G64  8.848623
3    G65  0.000000
4    G66 10.788718
5    B15  5.285402
6    B16  0.000000
7    B17  0.000000
8    C10  0.000000
9    C11  0.000000
I want to add a Condition column and fill it in as follows:
  cellNr     value    Condition
1    G63  0.000000       Growth
2    G64  8.848623       Growth
3    G65  0.000000       Growth
4    G66 10.788718       Growth
5    B15  5.285402        Burst
6    B16  0.000000        Burst
7    B17  0.000000        Burst
8    C10  0.000000 Cellularized
9    C11  0.000000 Cellularized
Answer 0 (score: 2)
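We can do this in base R by extracting the first character of cellNr with substr, converting it to a factor, and specifying the levels and the matching labels. The levels and labels are paired by position, so "G" maps to "Growth", "B" to "Burst", and "C" to "Cellularized".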
molten.pC$Condition <- as.character(factor(substr(molten.pC$cellNr, 1, 1),
levels = c("G", "B", "C"), labels = c("Growth", "Burst", "Cellularized")))
molten.pC$Condition
#[1] "Growth" "Growth" "Growth" "Growth" "Burst"
#[6] "Burst" "Burst" "Cellularized" "Cellularized"
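A closely related base R variant, shown here only as a sketch and not as part of the original answer, replaces the factor call with a named character vector used as a lookup table. The data frame is rebuilt from the question so the snippet runs on its own; the name `lookup` is an assumption introduced for illustration.

```r
# Rebuild the example data from the question
molten.pC <- data.frame(
  cellNr = c("G63", "G64", "G65", "G66", "B15", "B16", "B17", "C10", "C11"),
  value  = c(0, 8.848623, 0, 10.788718, 5.285402, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: first letter of cellNr -> condition label
lookup <- c(G = "Growth", B = "Burst", C = "Cellularized")

# Index the named vector by the first character of each ID;
# unname() drops the letter names that the indexing carries over
molten.pC$Condition <- unname(lookup[substr(molten.pC$cellNr, 1, 1)])
```

Any prefix that is missing from `lookup` yields NA, which makes unexpected IDs easy to spot.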
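Or we can use case_when from dplyr. Note that the final TRUE branch is a catch-all, so every ID that starts with neither "G" nor "B" is labelled "Cellularized".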
library(dplyr) #devel version (soon to be released `0.6.0`)
molten.pC %>%
mutate(Sub = substr(cellNr, 1, 1),
Condition = case_when(Sub=="G" ~"Growth",
Sub == "B" ~"Burst",
TRUE ~"Cellularized")) %>%
select(-Sub)
# cellNr value Condition
#1 G63 0.000000 Growth
#2 G64 8.848623 Growth
#3 G65 0.000000 Growth
#4 G66 10.788718 Growth
#5 B15 5.285402 Burst
#6 B16 0.000000 Burst
#7 B17 0.000000 Burst
#8 C10 0.000000 Cellularized
#9 C11 0.000000 Cellularized
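If IDs with other prefixes can occur, the catch-all branch will silently label them "Cellularized". The sketch below is a variation on the answer's code, not the answer's own method: it tests for "C" explicitly and falls back to NA. It assumes a dplyr version in which case_when works inside mutate, and the "X99" row is made up purely to exercise the fallback.

```r
library(dplyr)

# Small sample based on the question's data; "X99" (value 1) is a made-up row
molten.pC <- data.frame(
  cellNr = c("G63", "B15", "C10", "X99"),
  value  = c(0, 5.285402, 0, 1),
  stringsAsFactors = FALSE
)

result <- molten.pC %>%
  mutate(Condition = case_when(
    substr(cellNr, 1, 1) == "G" ~ "Growth",
    substr(cellNr, 1, 1) == "B" ~ "Burst",
    substr(cellNr, 1, 1) == "C" ~ "Cellularized",
    TRUE                        ~ NA_character_  # unknown prefix stays missing
  ))
```

Using NA_character_ keeps every branch of case_when the same type, which older dplyr releases require.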