How do I change participant numbers to specific values?

Time: 2020-04-16 12:44:46

Tags: r dataframe merge

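I recently started using R and have never coded before, so I am stuck on the following problem:

I have two data frames that I need to merge (they have different numbers of rows and columns). The merge itself is not the problem, but the participant variable is coded differently in the two data frames. The first data frame labels participants as -1, -2, -3, etc. The second data frame labels participants as STR_PP001, STR_PP002, STR_PP003, etc.

The goal is to combine all the data into one data frame that labels participants as STR_PP001 (or whatever the number of the specific participant is). Is there a way to convert the column in the first data frame so that it shows the participant codes in the STR_PP format rather than as -1?

Thanks in advance!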

3 answers:

Answer 0: (score: 2)

Example data:

a <- paste0("-", 1:4)
a
#[1] "-1" "-2" "-3" "-4"

Name conversion:

b <- paste0("STR_PP00", sapply(strsplit(a, "-"),"[[", 2))
b
#[1] "STR_PP001" "STR_PP002" "STR_PP003" "STR_PP004"

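Basically, this snippet splits each string at "-"; the output of strsplit() is a list with one character vector per input string. We then use sapply() to pick the second element of each vector in that list (the first element is the empty string in front of the "-"). Finally, paste0() glues the desired prefix and the extracted number together.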
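To see what each step contributes, the intermediate values can be printed one at a time. This is a small sketch on the same example data; the names `parts` and `digits` are introduced here purely for illustration and do not appear in the answer.

```r
a <- paste0("-", 1:4)

# strsplit() returns a list with one character vector per input string.
# Splitting "-1" at "-" yields an empty string before the dash and "1" after it.
parts <- strsplit(a, "-")
parts[[1]]
#[1] ""  "1"

# sapply() with "[[" and index 2 pulls the second element out of each vector.
digits <- sapply(parts, "[[", 2)
digits
#[1] "1" "2" "3" "4"

# paste0() glues the fixed prefix onto each extracted number.
paste0("STR_PP00", digits)
#[1] "STR_PP001" "STR_PP002" "STR_PP003" "STR_PP004"
```

Note that the hard-coded "STR_PP00" prefix is only right for single-digit IDs.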


Update to also handle higher IDs:

a <- paste0("-", 1:128)
b <- "STR_PP"
# Number of zeros required: subtract 1 because nchar() also counts the "-",
# subtract 3 because the ID part is at most 3 digits long (IDs > 99),
# and multiply by -1 because we want a positive count

zerolen <- ((nchar(a) - 1) - 3) * (-1)

# Now build the required number of zeros for each ID, based on its length

c <- sapply(zerolen, function(x){
  paste(as.character((rep(0, x))), collapse = "")
})

# Again combine with paste0()

paste0(b, c, sapply(strsplit(a, "-"),"[[", 2))

# Which results in:

head(paste0(b, c, sapply(strsplit(a, "-"),"[[", 2)), 20)

#  [1] "STR_PP001" "STR_PP002" "STR_PP003" "STR_PP004" "STR_PP005" 
#      "STR_PP006" "STR_PP007" "STR_PP008" "STR_PP009" "STR_PP010"
# [11] "STR_PP011" "STR_PP012" "STR_PP013" "STR_PP014" "STR_PP015" 
#      "STR_PP016" "STR_PP017" "STR_PP018" "STR_PP019" "STR_PP020"
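The same zero padding can also be sketched more compactly with base R's sprintf(). This is an alternative to the approach above, not the answerer's code, and it assumes every ID is a dash followed by digits only, so that as.integer() can parse it. The names `id` and `ids` are made up for this example.

```r
a <- paste0("-", 1:128)

# as.integer("-7") is -7; abs() drops the sign.
id <- abs(as.integer(a))

# "%03d" pads the integer with leading zeros to a width of 3.
ids <- sprintf("STR_PP%03d", id)

ids[c(1, 10, 128)]
#[1] "STR_PP001" "STR_PP010" "STR_PP128"
```

Because the width is stated once in the format string, supporting four-digit IDs would only mean changing "%03d" to "%04d".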

Answer 1: (score: 1)

This nested ifelse statement, using gsub and a backreference, works:

a <- c("-1", "-3", "-10", "-55", "-100", "-112")

ifelse(grepl("-\\d$", a),  paste0("STR_PP00", gsub("-(\\d)", "\\1", a)),
       ifelse(grepl("-\\d{2}$", a),  paste0("STR_PP0", gsub("-(\\d+)", "\\1", a)), 
              paste0("STR_PP", gsub("-(\\d+)", "\\1", a))))

[1] "STR_PP001" "STR_PP003" "STR_PP010" "STR_PP055" "STR_PP100" "STR_PP112"
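To make the branching easier to follow, here is a short sketch of what the individual pieces return on the same input. The variable names `one_digit`, `two_digits` and `stripped` are introduced here for illustration only.

```r
a <- c("-1", "-3", "-10", "-55", "-100", "-112")

# "-\\d$" matches a dash followed by exactly one digit at the end of the string.
one_digit <- grepl("-\\d$", a)
one_digit
#[1]  TRUE  TRUE FALSE FALSE FALSE FALSE

# "-\\d{2}$" matches a dash followed by exactly two digits at the end.
two_digits <- grepl("-\\d{2}$", a)
two_digits
#[1] FALSE FALSE  TRUE  TRUE FALSE FALSE

# The parentheses capture the digits; "\\1" in the replacement puts them back,
# so only the dash is removed.
stripped <- gsub("-(\\d+)", "\\1", a)
stripped
#[1] "1"   "3"   "10"  "55"  "100" "112"
```

Everything that matches neither test falls through to the last branch, which adds no padding zeros.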

Answer 2: (score: 0)

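One way that will certainly work, though, is the following. If the variable holding the -1, -2, ... codes (in the question, the one in the first data frame) is called VAR, you can do: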

VAR[which(VAR == -1)] <- "STR_PP001"

and so on for the other numbers. If -1 is stored as a character, you may have to use VAR[which(VAR == "-1")] <- "STR_PP001" instead.
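Put in the context of the merge from the question, the manual replacement might look like the sketch below. The data frames `df1` and `df2`, their column names and their values are all hypothetical stand-ins, since the question does not show the real data.

```r
# Hypothetical data frames standing in for the two described in the question.
df1 <- data.frame(participant = c(-1, -2, -3), score = c(10, 20, 30))
df2 <- data.frame(participant = c("STR_PP001", "STR_PP002", "STR_PP003"),
                  group = c("A", "B", "A"))

# Replace one code at a time. The first assignment turns the numeric vector
# into a character vector, so later comparisons are made on strings.
VAR <- df1$participant
VAR[which(VAR == -1)] <- "STR_PP001"
VAR[which(VAR == -2)] <- "STR_PP002"
VAR[which(VAR == -3)] <- "STR_PP003"
df1$participant <- VAR

# With matching codes in both data frames, merge() can join on that column.
merged <- merge(df1, df2, by = "participant")
merged
```

This needs one line per participant, so it is only practical for a handful of IDs.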