Modifying strings in R

Time: 2017-04-05 23:39:07

Tags: r

I am working with a data table whose first column ("first") contains strings (this is a subset):

  >first

  #[1] "A10"    "A10r"   "A1112"  "A1112r" "A116"   "A116r"  "A1212"  "A1212r" "A126"   "A126r"  "A1312"  "A1312r" "A136"   "A136r"  "A20"    "A20r"  
 #[17] "A2112"  "A2112r" "A216"   "A216r"  "A2212"  "A2212r" "A226"   "A226r"  "A2312"  "A2312r" "A236"   "A236r"  "A30"    "A30r"   "A3112"  "A3112r"

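I am trying to bring every string to a final format of 6 characters by inserting specific characters at different positions.

I used the following commands, one after another. First, append "s" to every string that does not contain "r":
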
 >middle1<-ifelse(!grepl("r",first),paste0(first,"s"),first)

 #[1] "A10s"   "A10r"   "A1112s" "A1112r" "A116s"  "A116r"  "A1212s" "A1212r" "A126s"  "A126r"  "A1312s" "A1312r" "A136s"  "A136r"  "A20s"   "A20r"  
 #[17] "A2112s" "A2112r" "A216s"  "A216r"  "A2212s" "A2212r" "A226s"  "A226r"  "A2312s" "A2312r" "A236s"  "A236r"  "A30s"   "A30r"   "A3112s" "A3112r"

Then I used the following command to insert the digit "0" after the first character, only when the string has fewer than 5 characters.

>middle2<-ifelse(nchar(middle1)<5,gsub('^(.{1})(.*)$','\\10\\2',middle1[nchar(middle1)<5]), middle1)

#[1] "A010s"  "A010r"  "A1112s" "A1112r" "A116s"  "A116r"  "A1212s" "A1212r" "A126s"  "A126r"  "A1312s" "A1312r" "A136s"  "A136r"  "C020s"  "C020r" 
# [17] "A2112s" "A2112r" "A216s"  "A216r"  "A2212s" "A2212r" "A226s"  "A226r"  "A2312s" "A2312r" "A236s"  "A236r"  "B030s"  "B030r"  "A3112s" "A3112r"

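I then repeated the previous command, this time inserting the digit "0" after the third character, only when the string has fewer than 6 characters. This brings every string to 6 characters.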

>last<-ifelse(nchar(middle2)<6,gsub('^(.{3})(.*)$','\\10\\2',middle2[nchar(middle2)<6]),middle2)

 #[1] "A0100s" "A0100r" "A1112s" "A1112r" "A1206s" "A1206r" "A1212s" "A1212r" "C0200s" "C0200r" "A1312s" "A1312r" "A2206s" "A2206r" "A2306s" "A2306r"
 #[17] "A2112s" "A2112r" "A3106s" "A3106r" "A2212s" "A2212r" "A3306s" "A3306r" "A2312s" "A2312r" "A4206s" "A4206r" "A4306s" "A4306r" "A3112s" "A3112r"

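However, the problem is that positions within the vector have shifted (for example, "C0200s" and "C0200r" have changed position). Ultimately I need to use these strings to label rows, so they must stay in their original positions. I am new to this, so I apologize in advance if the answer is obvious or if I have written something incorrectly.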

So my question is: how can I modify strings in R without reordering the vector?
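The shift described above can be reproduced on a small vector. The sketch below uses a hypothetical four-element vector `x` (not taken from the real data) and suggests a likely cause: the `yes` argument of `ifelse()` is built from a subset (`x[short]`), so it is shorter than the test vector, gets recycled, and is then read at the original positions.

```r
# Hypothetical four-element vector, used only to illustrate the effect.
x <- c("A1112s", "A10s", "A116s", "A20s")
short <- nchar(x) < 5

# 'yes' is built from the subset x[short], so it has 2 elements instead of 4.
# ifelse() recycles it to length 4 and then reads it at the positions where
# 'short' is TRUE, so values no longer line up with their original slots.
misaligned <- ifelse(short, sub("^(.)(.*)$", "\\10\\2", x[short]), x)

# Passing the full vector keeps 'yes' the same length as the test,
# so every element stays in its own position.
aligned <- ifelse(short, sub("^(.)(.*)$", "\\10\\2", x), x)

print(misaligned)
print(aligned)
```

In `misaligned`, the second element ends up derived from "A20s" rather than "A10s", which is the same kind of displacement seen in the output above.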

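1 Answer:

Answer 0: (score: 0)

Here is a simpler way to implement the logic. I would also recommend adding sanity checks throughout the process to make sure the logic is robust.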

first <- c("A10","A10r","A1112","A1112r","A116","A116r","A1212","A1212r","A126","A126r","A1312","A1312r","A136","A136r","A20","A20r",
    "A2112","A2112r","A216","A216r","A2212","A2212r","A226","A226r","A2312","A2312r","A236","A236r","A30","A30r","A3112","A3112r")

library(stringi)
unlist(lapply(stri_split_boundaries(first, type="character"), function(x) {

    if (length(x) < 3) {
        print(x)
        stop("Logic will not apply correctly")
    }

    #add an "s" to all strings not containing "r"
    if (tail(x, 1) != "r") x <- c(x, "s")

    if (length(x) < 4) {
        print(x)
        stop("Logic will not apply correctly")
    }

    #add a digit "0" after the first element, only if there were fewer than 5 elements.
    if (length(x) < 5) x <- c(x[1], "0", x[-1])

    if (length(x) < 5) {
        print(x)
        stop("Logic will not apply correctly")
    }

    #adding a digit "0" after the third element, only if there were fewer than 6 elements
    if (length(x) < 6) x <- c(x[seq_len(3)], "0", x[-seq_len(3)])

    if (length(x) != 6) {
        print(x)
        stop("Check logic.")   
    }

    paste(x, collapse="")
}))
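
As an alternative to splitting each string into characters, the original three-step approach can probably be kept in base R as long as `sub()` receives the full vector rather than a subset. This is only a sketch under that assumption, not the answer's method; the intermediate names `step1` and `step2` are made up for illustration, and `first` here is just the first six values from the question.

```r
# First six values from the question's subset.
first <- c("A10", "A10r", "A1112", "A1112r", "A116", "A116r")

# Step 1: append "s" to strings that do not contain "r".
step1 <- ifelse(grepl("r", first, fixed = TRUE), first, paste0(first, "s"))

# Step 2: insert "0" after the first character when fewer than 5 characters.
# sub() is applied to the whole vector so 'yes' matches the test in length.
step2 <- ifelse(nchar(step1) < 5, sub("^(.)(.*)$", "\\10\\2", step1), step1)

# Step 3: insert "0" after the third character when fewer than 6 characters.
last <- ifelse(nchar(step2) < 6, sub("^(.{3})(.*)$", "\\10\\2", step2), step2)

# Sanity check: every string should now have exactly 6 characters.
stopifnot(all(nchar(last) == 6))

print(last)
```

Because each `ifelse()` call works element by element on vectors of equal length, the result keeps the order of `first` and can be used directly to label rows.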