Suppose you have a data.frame like this:
FDR_1 Label_1 FDR_2 Label_2
0.001 NA 0.45 NA
0.34 NA 6 NA
0.2 NA 3 NA
2 NA 2.5 NA
4 NA 0.001 NA
With 10,000 rows and 3,000 columns in total, you want the following output:
FDR_1 Label_1 FDR_2 Label_2
0.001 NA 0.45 NA
0.34 NA 6 Y
0.2 NA 3 Y
2 Y 2.5 Y
4 Y 0.001 NA
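In other words, you want to add a "Y" flag to the Label_* column in every row where the matching FDR_* column contains a value > 2 (the expected output above also flags a value of exactly 2).
I tried: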
lapply(mydf, function(x) ifelse(mydf[, grepl( "FDR" , names(mydf) ) > 2, .....)
But I don't know how to go on from there and add the flag.
Can anyone help me?
Thanks in advance.
Answer 0 (score: 3)
We can do this in base R:
df1[!i1] <- 'Y'[(NA^(df1[i1] <= 2))]
df1
# FDR_1 Label_1 FDR_2 Label_2
#1 0.001 <NA> 0.450 <NA>
#2 0.340 <NA> 6.000 Y
#3 0.200 <NA> 3.000 Y
#4 2.000 <NA> 2.500 Y
#5 4.000 Y 0.001 <NA>
where df1 and i1 are defined as follows (run these first):
df1 <- structure(list(FDR_1 = c(0.001, 0.34, 0.2, 2, 4), Label_1 = c(NA,
NA, NA, NA, NA), FDR_2 = c(0.45, 6, 3, 2.5, 0.001), Label_2 = c(NA,
NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, -5L
))
i1 <- grepl("^FDR", names(df1))
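A more explicit alternative, offered as a minimal sketch rather than a replacement for the one-liner above. It assumes every FDR_k column has a partner named Label_k (pairing by name is an assumption, not something stated in the question). It also uses >= 2 so that a value of exactly 2 is flagged, as in the asker's expected output; the one-liner above uses <= 2 and therefore leaves that cell as NA.

```r
# Sample data from the question.
df1 <- data.frame(
  FDR_1   = c(0.001, 0.34, 0.2, 2, 4),
  Label_1 = NA,
  FDR_2   = c(0.45, 6, 3, 2.5, 0.001),
  Label_2 = NA
)

# Assumption: each FDR_k column has a matching Label_k column.
fdr_cols   <- grep("^FDR_", names(df1), value = TRUE)
label_cols <- sub("^FDR_", "Label_", fdr_cols)

# One character vector per FDR column: "Y" where the value is >= 2, NA otherwise.
df1[label_cols] <- lapply(df1[fdr_cols], function(x) {
  ifelse(x >= 2, "Y", NA_character_)
})

df1
```

Because the work is done one column at a time on plain vectors, this should scale to the 3,000 columns mentioned in the question without reshaping the data.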
Answer 1 (score: 2)
We can use base R's split.default to split the columns by the number in their names, i.e.
do.call(cbind,
lapply(split.default(df, gsub('\\D+', '',names(df))), function(i){
i[2] <- replace(i[2], i[1] >= 2, 'Y'); i}))
# 1.FDR_1 1.Label_1 2.FDR_2 2.Label_2
#1 0.001 <NA> 0.450 <NA>
#2 0.340 <NA> 6.000 Y
#3 0.200 <NA> 3.000 Y
#4 2.000 Y 2.500 Y
#5 4.000 Y 0.001 <NA>
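The prefixed names in the output above (1.FDR_1, 1.Label_1, ...) appear because cbind prepends the names of the list returned by split.default. A hedged sketch of one way to keep the original names and column order: drop the list names with unname before binding, then reindex with names(df). It assumes the FDR column comes first within each pair, and the helper names col_pairs, flagged and res are illustrative only.

```r
# Sample data from the question.
df <- data.frame(
  FDR_1   = c(0.001, 0.34, 0.2, 2, 4),
  Label_1 = NA,
  FDR_2   = c(0.45, 6, 3, 2.5, 0.001),
  Label_2 = NA
)

# Split the columns into FDR/Label pairs by their numeric suffix.
col_pairs <- split.default(df, gsub("\\D+", "", names(df)))

# Assumption: within each pair, column 1 is FDR and column 2 is Label.
flagged <- lapply(col_pairs, function(p) {
  p[[2]][p[[1]] >= 2] <- "Y"
  p
})

# unname() stops cbind from prefixing the group names;
# [names(df)] restores the original column order.
res <- do.call(cbind, unname(flagged))[names(df)]
res
```

The reindexing step matters with many pairs: the groups are ordered as character strings, so suffixes typically sort as "1", "10", "11", ..., "2" rather than numerically.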
Answer 2 (score: 1)
A loop-"free" variant using base R's reshape:
df <- structure(list(FDR_1 = c(0.001, 0.34, 0.2, 2, 4),
Label_1 = c(NA, NA, NA, NA, NA),
FDR_2 = c(0.45, 6, 3, 2.5, 0.001),
Label_2 = c(NA, NA, NA, NA, NA)),
class = "data.frame",
row.names = c(NA, -5L))
mv <- lapply(split(names(df),
gsub("(.+)_\\d+",
"\\1",
names(df))), sort)
data_long <- reshape(df,
varying = mv,
direction = "long",
v.names = names(mv))
data_long$Label[data_long$FDR >= 2] <- "Y"
reshape(data_long)
# id FDR_1 Label_1 FDR_2 Label_2
# 1.1 1 0.001 <NA> 0.450 <NA>
# 2.1 2 0.340 <NA> 6.000 Y
# 3.1 3 0.200 <NA> 3.000 Y
# 4.1 4 2.000 Y 2.500 Y
# 5.1 5 4.000 Y 0.001 <NA>
Answer 3 (score: 0)
You can also try the tidyverse:
library(tidyverse)
read.table(text=" FDR_1 Label_1 FDR_2 Label_2
0.001 NA 0.45 NA
0.34 NA 6 NA
0.2 NA 3 NA
2 NA 2.5 NA
4 NA 0.001 NA ", header=T) %>%
rownames_to_column() %>%
gather(k, v, -rowname) %>%
separate(k, into = c("k1", "k2")) %>%
spread(k1, v) %>%
mutate(Label = ifelse(FDR >= 2, "Y", Label)) %>%
gather(k, v, -rowname, -k2) %>%
unite(k, k2, k) %>% # changing the colnames a little bit
spread(k, v) %>%
select(-1)
#   1_FDR 1_Label 2_FDR 2_Label
# 1 0.001    <NA>  0.45    <NA>
# 2  0.34    <NA>     6       Y
# 3   0.2    <NA>     3       Y
# 4     2       Y   2.5       Y
# 5     4       Y 0.001    <NA>