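I'm looking to convert 15-character Salesforce IDs to their 18-character form in R. The algorithm is described here: https://salesforce.stackexchange.com/questions/27686/how-can-i-convert-a-15-char-id-value-into-an-18-char-id-value but that is in C#, and I want to do it in R.
I have written a clunky function in R that takes a single 15-character input and successfully returns the 18-character ID. I would like to know how to apply it to a column of a data.frame via dplyr.
Reproducible code: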
library(dplyr)  # provides data_frame(), %>%, select() and left_join()

SFID_Convert <- function(fifteen_digit) {
  if (length(fifteen_digit == 15)) {
    # binary map ----
    binary <-
      c(
        "00000",
        "00001",
        "00010",
        "00011",
        "00100",
        "00101",
        "00110",
        "00111",
        "01000",
        "01001",
        "01010",
        "01011",
        "01100",
        "01101",
        "01110",
        "01111",
        "10000",
        "10001",
        "10010",
        "10011",
        "10100",
        "10101",
        "10110",
        "10111",
        "11000",
        "11001",
        "11010",
        "11011",
        "11100",
        "11101",
        "11110",
        "11111"
      )
    letter <- c(LETTERS, 0:5)
    binarymap <- data_frame(binary, letter)
    # sfid ----
    sfid <- substr(fifteen_digit, 1, 15)
    s1 <- substr(sfid, 1, 5)
    s2 <- substr(sfid, 6, 10)
    s3 <- substr(sfid, 11, 15)
    convertID <- function(str_frag) {
      str_frag <- paste(rev(strsplit(str_frag, NULL)[[1]]), collapse = '')
      str_frag <- strsplit(str_frag, NULL)[[1]]
      str_frag[which(unlist(gregexpr("[0-9]", str_frag)) == 1)] <- 0
      str_frag[which(unlist(gregexpr("[a-z]", str_frag)) == 1)] <- 0
      str_frag[which(unlist(gregexpr("[A-Z]", str_frag)) == 1)] <- 1
      str_frag <<- paste(str_frag, collapse = '')
    }
    convertID(s1)
    n1 <- str_frag
    convertID(s2)
    n2 <- str_frag
    convertID(s3)
    n3 <- str_frag
    binary <- data_frame(c(n1, n2, n3)) %>%
      select(binary = 1) %>%
      left_join(binarymap)
    return(paste(sfid, paste(binary$letter[1:3], collapse = ''), sep = ''))
  }
}
Example:
sfid <- "001a003920aSDuh"
SFID_Convert(sfid)
[1] "001a003920aSDuhAAG"
This is exactly what I want, but when I apply it to a data frame ...
col <- c("001a003920aSDuh", "001a08h010JNkJd")
name <- c("compA", "compB")
df <- data_frame(name, col)
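it correctly computes "AAG" for the first row and then applies that same suffix to every row. I could lapply over the column, but if I have a data frame with 100,000 rows, I think that is the wrong approach.
Any help is appreciated! Still learning here. :)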
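Answer 0 (score: 2)
Your code has several problems. Below is a possible solution, which should also be more efficient:
1: Define the mapping between binary strings and letters. You can do this outside the function: define it once, with all the necessary conversions, and use it inside the function.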
binary <- c("00000","00001","00010","00011","00100","00101","00110","00111",
"01000","01001","01010","01011","01100","01101","01110","01111",
"10000","10001","10010","10011","10100","10101","10110","10111",
"11000","11001","11010","11011","11100","11101","11110","11111")
binary.reverse <- lapply(binary, function(x){paste0(rev(strsplit(x, split = "")[[1]]), collapse = "")})
binary2letter <- c(LETTERS, 0:5)
names(binary2letter) <- unlist(binary.reverse)
rm(binary, binary.reverse)
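I also reverse the binary strings in this step, so that I don't have to repeat that operation for every ID. The result is stored in a named vector rather than a data frame.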
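As a rough illustration of how a named-vector lookup like the one in step 1 behaves, the sketch below rebuilds an equivalent map arithmetically instead of typing out all 32 strings. The names `lookup` and `bits_lsb_first` are illustrative and not part of the answer's code; the assumption is that writing each value 0..31 with its least-significant bit first gives the same names as reversing the binary strings.

```r
# Values 0..31, each written as 5 bits with the least-significant bit first
# (this is the "reversed" form of the usual binary string).
vals <- 0:31
bits_lsb_first <- vapply(vals, function(v) {
  paste0((v %/% 2^(0:4)) %% 2, collapse = "")
}, character(1))

# Named character vector: names are the bit patterns, values are the suffix characters.
lookup <- setNames(c(LETTERS, 0:5), bits_lsb_first)

# Indexing by name is vectorised: one suffix character per pattern.
lookup[c("00000", "01100", "11010")]
```

Because indexing by name accepts a whole character vector, a full column of 5-character patterns can be translated in one call, with no join and no loop.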
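2: Write the function so that it accepts a vector as input. Note that to check whether a string contains X characters, you should use nchar() rather than length(). The latter returns the number of strings in the vector, not the number of characters in a string.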
SFID_Convert <- function(sfid) {
  sfid <- as.character(sfid)  # in case the input column is a factor
  # Mark uppercase letters as 1 and everything else (lowercase, digits) as 0
  str_num <- gsub("[A-Z]", "1", gsub("[a-z0-9]", "0", sfid))
  s1 <- substring(str_num, 1, 5)
  s2 <- substring(str_num, 6, 10)
  s3 <- substring(str_num, 11, 15)
  sfid.addon <- paste0(sfid,
                       binary2letter[s1],
                       binary2letter[s2],
                       binary2letter[s3])
  # Only extend IDs that are exactly 15 characters long; anything else
  # (including NA) is returned unchanged
  is15 <- !is.na(sfid) & nchar(sfid) == 15
  sfid[is15] <- sfid.addon[is15]
  return(sfid)
}
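The nchar() versus length() point from step 2 can be checked in isolation. This is only a small sketch; the variable names `id`, `ids` and `is_15` are made up for the illustration.

```r
id  <- "001a003920aSDuh"
ids <- c("001a003920aSDuh", "001a08h010JNkJd", "short")

length(id)   # number of elements in the vector: 1
nchar(id)    # number of characters in the string: 15

# The comparison is evaluated first, so this is the length of a logical
# vector. It is non-zero, and therefore truthy inside if(), for any
# non-empty input, whatever the ID looks like.
length(id == 15)

# nchar() is vectorised: one TRUE/FALSE per ID, usable as a row filter.
is_15 <- nchar(ids) == 15
is_15
```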
Checking the solution:
sfid <- "001a003920aSDuh"
col <- c("001a003920aSDuh", "001a08h010JNkJd")
name <- c("compA", "compB")
df <- data_frame(name, col)
> SFID_Convert(sfid)
[1] "001a003920aSDuhAAG"
> df %>% mutate(new.col = SFID_Convert(col))
# A tibble: 2 x 3
  name  col             new.col
  <chr> <chr>           <chr>
1 compA 001a003920aSDuh 001a003920aSDuhAAG
2 compB 001a08h010JNkJd 001a08h010JNkJdAAL
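For cross-checking the results above, here is a lookup-free sketch in base R that computes each suffix character arithmetically. It is not the answer's method: the function name `to18` and its internals are hypothetical, and it loops over IDs with vapply(), so it will likely be slower than the gsub()-based version on large columns.

```r
to18 <- function(id) {
  id <- as.character(id)
  ok <- !is.na(id) & nchar(id) == 15   # only 15-character IDs get a suffix
  alphabet <- c(LETTERS, 0:5)          # 32 possible suffix characters

  suffix <- vapply(id[ok], function(x) {
    chars <- strsplit(x, "")[[1]]
    upper <- chars %in% LETTERS        # TRUE where the character is A-Z
    # Within each block of 5 characters, position i (0-based) is worth 2^i.
    weights <- matrix(upper * 2^(0:4), nrow = 5)
    idx <- colSums(weights)            # one value in 0..31 per block
    paste0(alphabet[idx + 1], collapse = "")
  }, character(1), USE.NAMES = FALSE)

  id[ok] <- paste0(id[ok], suffix)
  id
}

to18(c("001a003920aSDuh", "001a08h010JNkJd", "001a003920aSDuhAAG"))
```

The first two inputs should come back with the same `AAG` and `AAL` suffixes as in the tibble above, and the third, which already has 18 characters, should be returned unchanged.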