I have a data frame that looks like this:
df <-
structure(
list(
Exception1 = c(
"Comments from {2}: {0}",
"status updated to {1} by {2}. Description:{0}",
"status updated to {1} by {2}. Description:{0}",
"information only.",
"status updated to {1} by {2}. Description:{0}",
"status updated to {1} by {2}. Description:{0}"
),
Exception2 = c(
"Customer {0} said bla",
"Status updated to {1}",
"Customer said {2}",
"User {0} foo",
"{0} {1}",
"{1} {2}"
),
ARGUMENT1 = c("OK", " ", " ", "PAY9089723089-98391", " ", " "),
ARGUMENT2 = c(
"null",
"Processing",
"Reconciled",
"null",
"Processing",
"Reconciled"
),
ARGUMENT3 = c(
"company name",
"company name",
"company name",
"null",
"company name",
"company name"
)
),
row.names = c(NA, 6L),
class = "data.frame"
)
As a table:
| Exception1 | Exception2 | ARGUMENT1 | ARGUMENT2 | ARGUMENT3 |
|-----------------------------------------------|-----------------------|---------------------|------------|--------------|
| Comments from {2}: {0} | Customer {0} said bla | OK | null | company name |
| status updated to {1} by {2}. Description:{0} | Status updated to {1} | | Processing | company name |
| status updated to {1} by {2}. Description:{0} | Customer said {2} | | Reconciled | company name |
| information only. | User {0} foo | PAY9089723089-98391 | null | null |
| status updated to {1} by {2}. Description:{0} | {0} {1} | | Processing | company name |
| status updated to {1} by {2}. Description:{0} | {1} {2} | | Reconciled | company name |
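The Exception1 and Exception2 columns (I removed several other Exception columns for readability) contain {} placeholders that should be replaced with the values from the ARGUMENT* columns: {0} maps to ARGUMENT1, {1} to ARGUMENT2, and {2} to ARGUMENT3.
I have been looking for ways to do this and have been fairly successful, but I still lack the experience to do it well.
I wrote a simple function that does the replacement with gsub:
library(magrittr) # provides %<>% and %>%
excp_ren2 <- function(x) {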
x %<>%
gsub("\\{1\\}", x["ARGUMENT2"], .) %>%
gsub("\\{0\\}", x["ARGUMENT1"], .) %>%
gsub("\\{2\\}", x["ARGUMENT3"], .)
x
}
Then I tried apply and its variants. I got a usable result with, for example:
new_df <-
df %>% apply(
.,
MARGIN = 1,
FUN = function(x)
excp_ren2(x)
) %>% as.data.frame()
The only drawback is that this transposes the matrix, but that is not really a problem.
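For reference, the transposition happens because apply() with MARGIN = 1 returns one column per input row. A minimal base-R sketch of undoing it with t() follows; `sample_df` and `fill_row` are hypothetical names used only for illustration, and the sample is a cut-down stand-in for the real data.

```r
# Hypothetical two-row sample with the same layout as df
sample_df <- data.frame(
  Exception1 = c("Comments from {2}: {0}", "status updated to {1} by {2}"),
  ARGUMENT1 = c("OK", "n/a"),
  ARGUMENT2 = c("null", "Processing"),
  ARGUMENT3 = c("company name", "company name"),
  stringsAsFactors = FALSE
)

# Fill the placeholders for one row, passed in as a named character vector
fill_row <- function(x) {
  out <- x
  for (i in 0:2) {
    # {0} -> ARGUMENT1, {1} -> ARGUMENT2, {2} -> ARGUMENT3
    out <- gsub(paste0("{", i, "}"), x[[paste0("ARGUMENT", i + 1)]], out, fixed = TRUE)
  }
  names(out) <- names(x)
  out
}

# apply() yields one column per input row, so t() restores the orientation
filled <- as.data.frame(t(apply(sample_df, 1, fill_row)), stringsAsFactors = FALSE)
filled$Exception1
```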
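I am looking for a better way to do this. I thought I could do it with mutate_*, but I don't think I can access the other columns of the row by name inside the function, or at least I don't know how. Any ideas on a simpler way to achieve this?
Thanks!
Answer 0 (score: 2)
We can use str_replace_all in a pipe. It is vectorized, so there is no need to work row by row, and it touches only "Exception1" rather than every column in the row.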
library(stringr)
library(dplyr)
df %>%
transmute(new = str_replace_all(Exception1, "\\{1\\}", ARGUMENT2) %>%
str_replace_all("\\{0\\}", ARGUMENT1) %>%
str_replace_all("\\{2\\}", ARGUMENT3))
# new
#1 Comments from company name: OK
#2 status updated to Processing by company name. Description:
#3 status updated to Reconciled by company name. Description:
#4 information only.
#5 status updated to Processing by company name. Description:
#6 status updated to Reconciled by company name. Description:
If we have multiple columns, we can use mutate_at or transmute_at:
df %>%
transmute_at(vars(starts_with("Exception")), ~
str_replace_all(., "\\{1\\}", ARGUMENT2) %>%
str_replace_all("\\{0\\}", ARGUMENT1) %>%
str_replace_all("\\{2\\}", ARGUMENT3))
# Exception1 Exception2
#1 Comments from company name: OK Customer OK said bla
#2 status updated to Processing by company name. Description: Status updated to Processing
#3 status updated to Reconciled by company name. Description: Customer said company name
#4 information only. User PAY9089723089-98391 foo
#5 status updated to Processing by company name. Description: Processing
#6 status updated to Reconciled by company name. Description: Reconciled company name
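In dplyr 1.0.0 and later, the scoped verbs such as mutate_at are superseded by across(). The same idea could be sketched as below, assuming a recent dplyr and stringr are installed; `sample_df` is a hypothetical cut-down sample, not the original data.

```r
library(dplyr)
library(stringr)

# Hypothetical two-row sample with the same layout as df
sample_df <- tibble(
  Exception1 = c("Comments from {2}: {0}", "status updated to {1} by {2}"),
  Exception2 = c("Customer {0} said bla", "Status updated to {1}"),
  ARGUMENT1 = c("OK", "n/a"),
  ARGUMENT2 = c("null", "Processing"),
  ARGUMENT3 = c("company name", "company name")
)

# Apply the three vectorized replacements to every Exception* column;
# the ARGUMENT* columns are looked up in the data frame by name
filled <- sample_df %>%
  mutate(across(
    starts_with("Exception"),
    ~ .x %>%
      str_replace_all("\\{0\\}", ARGUMENT1) %>%
      str_replace_all("\\{1\\}", ARGUMENT2) %>%
      str_replace_all("\\{2\\}", ARGUMENT3)
  ))

filled$Exception2
```

Unlike transmute_at, mutate() keeps the ARGUMENT* columns; add select(starts_with("Exception")) if only the filled templates are wanted.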
Answer 1 (score: 1)
Maybe something like this:
library(dplyr)
library(stringr)
clean_pipe <- . %>%
mutate(new_string = Exception1 %>% str_replace_all(pattern = "\\{0\\}",replacement = ARGUMENT1)) %>%
mutate(new_string = new_string %>% str_replace_all(pattern = "\\{1\\}",replacement = ARGUMENT2)) %>%
mutate(new_string = new_string %>% str_replace_all(pattern = "\\{2\\}",replacement = ARGUMENT3))
df %>%
clean_pipe
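Answer 2 (score: 1)
The way you are using { } as delimiters made me think of glue, which works similarly. To get glue templates that match the column names in your data, first use stringr::str_replace_all with a named vector to swap every placeholder for its replacement in one step. Then create glue objects from the "Exception*" columns. Per this post (R dplyr: rowwise + mutate (+glue) - how to get/refer row content?), you will need rowwise, because otherwise glue tries to use every value of the argument columns for each template. I wanted to put both mutate_at steps into a single function but ran into some scoping trouble, so this is the simplest version I could get to work.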
library(dplyr)
library(tidyr)
replacements <- c("\\{1\\}" = "{ARGUMENT2}",
"\\{0\\}" = "{ARGUMENT1}",
"\\{2\\}" = "{ARGUMENT3}")
as_tibble(df) %>%
rowwise() %>%
mutate_at(vars(starts_with("Exception")), stringr::str_replace_all, replacements) %>%
mutate_at(vars(starts_with("Exception")), ~as.character(glue::glue(.)))
#> Source: local data frame [6 x 5]
#> Groups: <by row>
#>
#> # A tibble: 6 x 5
#> Exception1 Exception2 ARGUMENT1 ARGUMENT2 ARGUMENT3
#> <chr> <chr> <chr> <chr> <chr>
#> 1 Comments from company name… Customer OK sai… OK null company …
#> 2 "status updated to Process… Status updated … " " Processi… company …
#> 3 "status updated to Reconci… Customer said c… " " Reconcil… company …
#> 4 information only. User PAY9089723… PAY908972308… null null
#> 5 "status updated to Process… " Processing" " " Processi… company …
#> 6 "status updated to Reconci… Reconciled comp… " " Reconcil… company …
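As a standalone illustration of the glue step used above, the sketch below fills one already-rewritten template from a list of named values. It assumes the glue package is installed; `row_values` and `template` are hypothetical names standing in for one row of the data.

```r
library(glue)

# Hypothetical values standing in for one row of the data frame
row_values <- list(
  ARGUMENT1 = "OK",
  ARGUMENT2 = "null",
  ARGUMENT3 = "company name"
)

# After {2} is rewritten to {ARGUMENT3} and {0} to {ARGUMENT1}, the braces
# contain names that glue can look up in the supplied values
template <- "Comments from {ARGUMENT3}: {ARGUMENT1}"

filled <- as.character(glue_data(row_values, template))
filled
```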
Note that because some of the argument strings are blank, the results contain extra whitespace, which you can trim with trimws.
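A minimal base-R sketch of that trimming step, applied to every Exception* column at once; `res` is a hypothetical result with the kind of stray whitespace described above.

```r
# Hypothetical result containing leading and trailing whitespace
res <- data.frame(
  Exception1 = c("Comments from company name: OK",
                 "status updated to Processing by company name. Description: "),
  Exception2 = c("Customer OK said bla", "  Processing"),
  stringsAsFactors = FALSE
)

# Trim leading and trailing whitespace in every Exception* column
cols <- grep("^Exception", names(res))
res[cols] <- lapply(res[cols], trimws)

res$Exception2
```

trimws only removes whitespace at the ends of each string; runs of spaces in the middle of a string are left as they are.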