I have a data frame:
chrom position ref var normal_reads1 normal_reads2 normal_var_freq normal_gt tumor_reads1 tumor_reads2 tumor_var_freq tumor_gt somatic_status variant_p_value somatic_p_value
1 2L 13048 A T 32 23 41.82 W 17 6 26.09 W Germline 7.507123e-11 0.9437542
2 2L 16467 G A 0 43 100.00 A 0 24 100.00 A <NA> 6.674261e-40 1.0000000
3 2L 20682 T A 32 14 30.43 W 14 6 30.00 W Germline 1.746726e-07 0.6223244
4 2L 25727 T G 52 22 29.73 K 16 4 20.00 K Germline 2.000049e-09 0.8758070
5 2L 25729 A T 49 23 31.94 W 16 4 20.00 W Germline 7.938282e-10 0.9092970
6 2L 25741 T C 45 28 38.36 Y 15 6 28.57 Y Germline 1.497796e-12 0.8604958
I want to change the value of the somatic_status column to 'ROH' if normal_var_freq and tumor_var_freq are both > 90.
Here is what I have tried:
snps <- within(snps, somatic_status[normal_var_freq > 90 & tumor_var_freq > 90] <- 'ROH')
But I get this warning, and the affected values become NA:
Warning message:
In `[<-.factor`(`*tmp*`, normal_var_freq > 90 & tumor_var_freq > :
invalid factor level, NA generated
Can anyone point me in the right direction?
Answer 0 (score: 1)
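The warning appears because somatic_status is a factor and 'ROH' is not one of its levels. We can convert the column from factor to character class, and then assign 'ROH' to the elements selected by the logical vector ('i1'):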
i1 <- with(snps, normal_var_freq > 90 & tumor_var_freq > 90)
snps$somatic_status <- as.character(snps$somatic_status)
snps$somatic_status[i1] <- "ROH"
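To make the character-conversion approach easy to try in isolation, here is a minimal sketch on a toy data frame. The three rows are a cut-down stand-in for the data in the question, not the real data set, and the column is built as a factor on purpose to reproduce the situation.

```r
# Toy stand-in for the question's data; somatic_status is deliberately a factor
snps <- data.frame(
  normal_var_freq = c(41.82, 100.00, 30.43),
  tumor_var_freq  = c(26.09, 100.00, 30.00),
  somatic_status  = factor(c("Germline", NA, "Germline"))
)

# TRUE for rows where both frequencies exceed 90
i1 <- with(snps, normal_var_freq > 90 & tumor_var_freq > 90)

# A character vector accepts any string, so no level has to exist beforehand
snps$somatic_status <- as.character(snps$somatic_status)
snps$somatic_status[i1] <- "ROH"

print(snps$somatic_status)
# [1] "Germline" "ROH"      "Germline"
```

The column stays character afterwards; wrap it in factor() again if later code expects a factor.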
If we don't want to convert the factor column to character, we can instead add the new value to the column's levels before changing some elements to it:
levels(snps$somatic_status) <- c(levels(snps$somatic_status), "ROH")
snps$somatic_status[i1] <- "ROH"
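The factor-preserving variant can be checked the same way. This is a sketch on the same kind of toy data (illustrative values only); the %in% guard is an addition of mine, not part of the answer, so that re-running the code does not try to append a duplicate level.

```r
# Toy stand-in for the question's data; somatic_status is a factor
snps <- data.frame(
  normal_var_freq = c(41.82, 100.00, 30.43),
  tumor_var_freq  = c(26.09, 100.00, 30.00),
  somatic_status  = factor(c("Germline", NA, "Germline"))
)
i1 <- with(snps, normal_var_freq > 90 & tumor_var_freq > 90)

# Register "ROH" as a valid level first; skip if it is already there
if (!"ROH" %in% levels(snps$somatic_status)) {
  levels(snps$somatic_status) <- c(levels(snps$somatic_status), "ROH")
}

# The assignment now succeeds without the "invalid factor level" warning
snps$somatic_status[i1] <- "ROH"

print(levels(snps$somatic_status))
# [1] "Germline" "ROH"
```

Here the column remains a factor throughout, which matters if downstream code relies on its levels.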
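As for within: it is a useful function for creating new variables or updating existing ones, but it is not recommended for assigning a new value to only a subset of a column's elements.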
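If within is still preferred, one possible workaround (my own sketch, not something the answer proposes) is to rebuild the whole column inside it rather than assigning to a subset. The toy data below is illustrative only.

```r
# Toy stand-in for the question's data; somatic_status is a factor
snps <- data.frame(
  normal_var_freq = c(41.82, 100.00, 30.43),
  tumor_var_freq  = c(26.09, 100.00, 30.00),
  somatic_status  = factor(c("Germline", NA, "Germline"))
)

snps <- within(snps, {
  # Work on a character copy so "ROH" is always a legal value
  somatic_status <- as.character(somatic_status)
  # Replace the entire column: "ROH" where both frequencies exceed 90,
  # the previous value everywhere else
  somatic_status <- ifelse(normal_var_freq > 90 & tumor_var_freq > 90,
                           "ROH", somatic_status)
})

print(snps$somatic_status)
# [1] "Germline" "ROH"      "Germline"
```

Note that ifelse returns NA wherever the condition itself is NA, so rows with a missing frequency would lose their previous status under this sketch.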