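R: Simplifying the addition of a column based on values in an existing column

Time: 2014-08-06 18:27:42

Tags: r dataframe

I have a function that adds a new column to a data frame based on an existing column. My code currently looks like this: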

df <- data.frame("chr" = c("chr1", "chr2", "chr3", "chrX"), "B" = c("a", "c", "d", "b"))

df$chr <- factor(df$chr, levels = c("chr1", "chr2", "chr3", "chrX")) # Not really necessary here...

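I use the following function to add a new column holding the integer value of the chromosome number. I am wondering whether there is a simpler way to do this, perhaps by taking advantage of the factor levels. Replacing the current df$chr column with the integer values would also be fine.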

AddChr <- function(DataFrame){
  DataFrame$Chr <- NA
  DataFrame$Chr[DataFrame$chr == "chr1"] <- 1
  DataFrame$Chr[DataFrame$chr == "chr2"] <- 2
  DataFrame$Chr[DataFrame$chr == "chr3"] <- 3
  DataFrame$Chr[DataFrame$chr == "chrX"] <- 20
  DataFrame$Chr <- as.integer(DataFrame$Chr)
  return(DataFrame)
}

df <- AddChr(df)

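2 answers:

Answer 0: (score: 2)

This solution creates a named vector that translates your old labels into the new ones.

You want to end up with the numbers 1 to 21 as labels: 1:21

The names you want to translate from are the string "chr" followed by c(1:19, "X", "Y"):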

paste0("chr", c(1:19, "X", "Y"))
#  [1] "chr1"  "chr2"  "chr3"  "chr4"  "chr5"  "chr6"  "chr7"  "chr8"  "chr9"  "chr10"
# [11] "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chrX" 
# [21] "chrY"

If you use the second vector to name the first one, you get the mapping:

setNames(1:21, paste0("chr", c(1:19, "X", "Y")))
#  chr1  chr2  chr3  chr4  chr5  chr6  chr7  chr8  chr9 chr10 chr11 chr12 chr13 chr14 
#     1     2     3     4     5     6     7     8     9    10    11    12    13    14 
# chr15 chr16 chr17 chr18 chr19  chrX  chrY 
#    15    16    17    18    19    20    21

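Then subset the mapping with your column. Convert the column to character first: indexing with a factor uses its integer codes rather than its labels, which would return 4 instead of 20 for chrX here.

setNames(1:21, paste0("chr", c(1:19, "X", "Y")))[as.character(df$chr)]
# chr1 chr2 chr3 chrX 
#    1    2    3   20 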
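
Putting the steps above together, a minimal sketch of actually adding the column to the data frame might look like the following. The names chr_map and add_chr_number are made up for illustration and are not part of the original answer; the mapping assumes 19 autosomes plus X and Y, as above.

```r
# Lookup table: chromosome label -> integer
# (assumption: chr1..chr19, chrX, chrY, as in the mapping above)
chr_map <- setNames(1:21, paste0("chr", c(1:19, "X", "Y")))

# Hypothetical helper: adds an integer Chr column derived from the chr column
add_chr_number <- function(data, map = chr_map) {
  # as.character() matters: indexing with a factor would use its integer codes
  data$Chr <- unname(map[as.character(data$chr)])
  data
}

df <- data.frame(chr = c("chr1", "chr2", "chr3", "chrX"),
                 B = c("a", "c", "d", "b"))
df <- add_chr_number(df)
df$Chr
# [1]  1  2  3 20
```

Labels that are missing from the lookup table come back as NA, so unexpected chromosome names are easy to spot afterwards with is.na(df$Chr).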

Answer 1: (score: 1)

For your specific example, this will work:

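# Labels containing a digit keep that number; anything else (here only chrX) becomes 20.
# as.integer() is needed because gsub() returns character strings.
df$Chr <- as.integer(ifelse(grepl("\\d", df$chr), gsub("[[:alpha:]]", "", df$chr), 20))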
df
##    chr B Chr
## 1 chr1 a   1
## 2 chr2 c   2
## 3 chr3 d   3
## 4 chrX b  20
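
Since the question asks about using the factor levels, here is a rough sketch of that idea as well. It recodes the levels once and then indexes by the factor, deliberately relying on the factor's integer codes. Mapping X to 20 follows the question; mapping Y to 21 is an assumption borrowed from the other answer, and the variable name lev is only illustrative.

```r
df <- data.frame(
  chr = factor(c("chr1", "chr2", "chr3", "chrX"),
               levels = c("chr1", "chr2", "chr3", "chrX")),
  B = c("a", "c", "d", "b")
)

# One entry per factor level: strip the "chr" prefix, then recode X and Y
lev <- sub("^chr", "", levels(df$chr))
lev[lev == "X"] <- "20"
lev[lev == "Y"] <- "21"  # assumption: Y follows X

# Indexing by the factor picks the recoded value for each row's level
df$Chr <- as.integer(lev)[df$chr]
df$Chr
# [1]  1  2  3 20
```

This only touches the levels rather than every row, so it stays cheap for long data frames with few distinct chromosomes.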