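I want to double the values only in the rows whose cell type is 'hesc'. For example, row 4, which holds 5.929771, should become 11.859542 (5.929771 * 2).
The filter() function in the dplyr package does not help here, because I do not want to extract certain values into a new data frame.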
[data frame] 22x3
expr_value cell_type
1 5.345618 fibroblast
2 5.195871 fibroblast
3 5.247274 fibroblast
4 5.929771 hesc
5 5.873096 hesc
6 5.665857 hesc
7 6.791656 hips
8 7.133673 hips
9 7.574058 hips
10 7.208041 hips
11 7.402100 hips
12 7.167792 hips
13 7.156971 hips
14 7.197543 hips
15 7.035404 hesc
16 7.269474 hesc
17 6.715059 hesc
18 7.434339 hips
19 6.997586 hips
20 7.619770 hips
21 7.490749 hips
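How can I apply simple arithmetic to only certain rows in this case? I tried writing something like `if (dataframe$cell_type == "hesc") ...`, but I suspect there is a better way.
I already know that filter() in dplyr can extract just the rows I care about, but if I understand correctly it only gives me a new data frame. What I want is to locate certain rows in the existing data frame and apply the arithmetic to their values in place.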
Answer 0 (score: 1)
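You can also do it this way:
1) Select expr_value for the rows where cell_type1 == "hesc":
df[df$cell_type1 == "hesc", ]["expr_value"]
# expr_value
#4 5.929771
#5 5.873096
#6 5.665857
2) Multiply that selection by 2:
df[df$cell_type1 == "hesc", ]["expr_value"] * 2
3) Assign the result back in place:
df[df$cell_type1 == "hesc", ]["expr_value"] <- df[df$cell_type1 == "hesc", ]["expr_value"] * 2
Steps 1 and 2 are only there as explanation. All you actually need to run is step 3.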
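The same in-place update can also be written by indexing the column vector directly instead of the whole data frame. This is a minimal sketch, assuming a small made-up data frame called `toy` (not the data from the question) and a helper variable `is_hesc` introduced only for illustration:

```r
# Toy data frame; the column names mirror the ones used in this answer.
toy <- data.frame(
  expr_value = c(5.0, 6.0, 7.0, 8.0),
  cell_type1 = c("fibroblast", "hesc", "hips", "hesc"),
  stringsAsFactors = FALSE
)

# Logical mask marking the rows to change.
is_hesc <- toy$cell_type1 == "hesc"

# Double expr_value only where the mask is TRUE; other rows are untouched.
toy$expr_value[is_hesc] <- toy$expr_value[is_hesc] * 2

print(toy)
```

Storing the mask in a variable avoids repeating the condition on both sides of the assignment.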
df <- structure(list(expr_value = c(5.345618, 5.195871, 5.247274, 5.929771,
5.873096, 5.665857, 6.791656, 7.133673, 7.574058, 7.208041, 7.4021,
7.167792, 7.156971, 7.197543, 7.035404, 7.269474, 6.715059, 7.434339,
6.997586, 7.61977, 7.490749), cell_type1 = structure(c(1L, 2L,
3L, 4L, 5L, 6L, 20L, 21L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 7L), .Label = c("fibroblast", "fibroblast",
"fibroblast", "hesc", "hesc", "hesc", "hips", "hips", "hips",
"hips", "hips", "hips", "hips", "hips", "hips", "hips", "hips",
"hips", "hips", "hips", "hips"), class = "factor")), class = "data.frame", row.names = c(NA,
-21L))
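If your column contains NA values, the comparison yields NA for those rows, and a row index containing NA is not allowed in a data frame assignment. You can guard the in-place assignment like this: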
df[!is.na(df$cell_type1) & df$cell_type1 =="hesc",]["expr_value"] <- df[!is.na(df$cell_type1) & df$cell_type1 =="hesc",]["expr_value"] *2
# expr_value
#4 11.85954
#5 11.74619
#6 11.33171
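Another way to handle NA, shown here only as a sketch, is to wrap the comparison in base R's `which()`, which returns the positions of the TRUE values and drops NA. The data frame `toy` and the variable `rows` are made up for illustration:

```r
# Toy data with a missing cell type in row 3.
toy <- data.frame(
  expr_value = c(5.0, 6.0, 7.0, 8.0),
  cell_type1 = c("fibroblast", "hesc", NA, "hesc"),
  stringsAsFactors = FALSE
)

# which() keeps only the positions where the comparison is TRUE,
# so the NA row is never selected.
rows <- which(toy$cell_type1 == "hesc")

# Double the selected values in place.
toy$expr_value[rows] <- toy$expr_value[rows] * 2

print(toy)
```

This keeps the condition short, at the cost of working with integer positions rather than a logical mask.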
Answer 1 (score: 0)
We can use grep to find "hesc" in cell_type1 and then multiply the matching values by 2:
inds <- grep("hesc", df$cell_type1)
df$expr_value[inds] <- df$expr_value[inds] * 2
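Note that grep matches a pattern anywhere in the string, which matters for labels that carry a suffix, as in the data for this answer. A small sketch of the difference from an exact comparison, using a made-up vector `cell_type` and illustrative variable names:

```r
# Hypothetical labels with a numeric suffix, similar to the data in this answer.
cell_type <- c("fibroblast2", "hesc5", "hips8", "hesc6")

# Exact comparison finds nothing, because no label is exactly "hesc".
exact <- which(cell_type == "hesc")

# grep() treats "hesc" as a regular expression and matches it anywhere in the string.
partial <- grep("hesc", cell_type)

# Anchoring with ^ restricts the match to labels that start with "hesc".
prefix <- grep("^hesc", cell_type)

print(exact)
print(partial)
print(prefix)
```

If the labels are clean (exactly "hesc"), the `==` comparison is the stricter choice; grep is useful when only part of the label is known.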
Or, without an intermediate variable:
library(dplyr)
df %>% mutate(expr_value = expr_value * c(1, 2)[grepl("hesc", cell_type1) + 1])
# expr_value cell_type1
#1 5.345618 fibroblast2
#2 5.195871 fibroblast3
#3 5.247274 fibroblast4
#4 11.859542 hesc5
#5 11.746192 hesc6
#6 11.331714 hesc7
#7 6.791656 hips8
#8 7.133673 hips9
#9 7.574058 hips10
#....
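The `c(1, 2)[grepl(...) + 1]` expression in the mutate call may look cryptic, so here is a base R sketch of what it computes step by step. The vectors and the names `idx` and `multiplier` are made up for illustration:

```r
expr_value <- c(5.0, 6.0, 7.0)
cell_type  <- c("fibroblast2", "hesc5", "hips8")

# grepl() gives TRUE/FALSE; adding 1 turns that into the index 1 or 2.
idx <- grepl("hesc", cell_type) + 1

# Index 1 picks the multiplier 1 (unchanged), index 2 picks 2 (doubled).
multiplier <- c(1, 2)[idx]

result <- expr_value * multiplier
print(result)
```

Also note that mutate() returns a new data frame rather than modifying df, so the result would need to be assigned back (for example `df <- df %>% mutate(...)`) for the change to persist.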
Data
df <- structure(list(expr_value = c(5.345618, 5.195871, 5.247274, 5.929771,
5.873096, 5.665857, 6.791656, 7.133673, 7.574058, 7.208041, 7.4021,
7.167792, 7.156971, 7.197543, 7.035404, 7.269474, 6.715059, 7.434339,
6.997586, 7.61977, 7.490749), cell_type1 = structure(c(1L, 2L,
3L, 4L, 5L, 6L, 20L, 21L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
16L, 17L, 18L, 19L, 7L), .Label = c("fibroblast2", "fibroblast3",
"fibroblast4", "hesc5", "hesc6", "hesc7", "hips", "hips10", "hips11",
"hips12", "hips13", "hips14", "hips15", "hips16", "hips17", "hips18",
"hips19", "hips20", "hips21", "hips8", "hips9"), class = "factor")),
class = "data.frame", row.names = c(NA, -21L))