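I have a data frame in which, for various reasons, I need one of the columns to be a factor, keeping the order of its levels, with the periods in the levels replaced by spaces. Here is an example: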
library(tidyverse)
library(stringr)

sandwich <- c("bread", "mustard.sauce", "tuna.fish", "lettuce", "bread")

sandwich_df <- tibble(sandwich_str = sandwich) %>%
  mutate(sandwich_factor = factor(sandwich)) %>%
  # attempt to swap periods for spaces in the factor levels
  mutate(sandwich2 = factor(sandwich_factor,
                            levels = str_replace_all(levels(sandwich_factor), "\\.", " "))) %>%
  # same replacement applied to the plain character column
  mutate(sandwich3 = str_replace_all(sandwich_str, "\\.", " "))

print(sandwich_df)
# A tibble: 5 x 4
   sandwich_str  sandwich_factor sandwich2 sandwich3
   <chr>         <fctr>          <fctr>    <chr>
1  bread         bread           bread     bread
2  mustard.sauce mustard.sauce   <NA>      mustard sauce
3  tuna.fish     tuna.fish       <NA>      tuna fish
4  lettuce       lettuce         lettuce   lettuce
5  bread         bread           bread     bread
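So in this data frame:
sandwich_str is a character column.
sandwich_factor is a factor.
sandwich2 is my attempt to replace all the periods in the levels of sandwich_factor. For whatever reason, it returns NA wherever the value contains a period.
sandwich3 takes the simpler approach of replacing all the periods in the string with spaces. That approach works better.
So I would like to know what is going wrong in my attempt at sandwich2. I want it to look more like sandwich3, but as a factor. Any suggestions?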
Answer 0 (score: 1)
Does this do what you need?
library(tidyverse)
library(stringr)
# Data --------------------------------------------------------------------
sandwich <-
c("bread", "mustard.sauce", "tuna.fish", "lettuce", "bread")
df <-
  tibble(sandwich_str = sandwich)
# Convert periods to spaces -----------------------------------------------
df$sandwich_str <-
  df$sandwich_str %>%
  as.character() %>%
  str_replace_all("\\.", " ") %>%  # replace every period, not only the first
  as.factor()
# Print output ------------------------------------------------------------
df %>%
print()
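One caveat with this approach, since the question asks to keep the order of the levels: converting to character and back with as.factor() rebuilds the levels in alphabetical order. The sketch below uses base R only and assumes a hand-picked level order that is not in the original question (it is made up purely for illustration). It compares the round trip with renaming the levels in place.

```r
# Hypothetical factor with a hand-picked, non-alphabetical level order
sandwich <- c("bread", "mustard.sauce", "tuna.fish", "lettuce", "bread")
f <- factor(sandwich,
            levels = c("tuna.fish", "mustard.sauce", "lettuce", "bread"))

# Round trip through character: the levels are re-sorted alphabetically
round_trip <- as.factor(gsub(".", " ", as.character(f), fixed = TRUE))

# Rename the levels in place: the existing order is kept
relabelled <- f
levels(relabelled) <- gsub(".", " ", levels(relabelled), fixed = TRUE)

levels(round_trip)
levels(relabelled)
```

With the alphabetical default levels used in the question, both routes happen to give the same result, so the difference only matters once the levels have a custom order.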
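Answer 1 (score: 0)
Thanks to @aosmith for posting this answer as a comment. I am posting it here as an answer so that I can accept it and close the question.
The problem is that the new names have to be supplied through the labels argument of factor() rather than levels. The levels argument lists the values to look for in the input, so "mustard.sauce" finds no match among the space-separated names and becomes NA, whereas labels renames the existing levels. So the correct way to write what I had before is: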
library(tidyverse)
library(stringr)

sandwich <- c("bread", "mustard.sauce", "tuna.fish", "lettuce", "bread")

sandwich_df <- tibble(sandwich_str = sandwich) %>%
  mutate(sandwich_factor = factor(sandwich)) %>%
  # labels = renames the existing levels instead of re-matching the values
  mutate(sandwich2 = factor(sandwich_factor,
                            labels = str_replace_all(levels(sandwich_factor), "\\.", " "))) %>%
  mutate(sandwich3 = str_replace_all(sandwich_str, "\\.", " "))

print(sandwich_df)
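To see the difference between the two arguments in isolation, here is a minimal base R sketch on the same vector. The variable names (x, f, new_names, with_levels, with_labels) are only illustrative and do not come from the original post.

```r
x <- c("bread", "mustard.sauce", "tuna.fish", "lettuce", "bread")
f <- factor(x)

# Level names with periods swapped for spaces
new_names <- gsub(".", " ", levels(f), fixed = TRUE)

# levels = : the values of f are matched against new_names;
# "mustard.sauce" and "tuna.fish" are not among them, so they become NA
with_levels <- factor(f, levels = new_names)

# labels = : the existing levels are used for matching and then renamed
with_labels <- factor(f, labels = new_names)

is.na(with_levels)
as.character(with_labels)
```

If forcats is loaded, fct_relabel() should be usable for the same kind of renaming, and assigning to levels(f) directly is another base R option.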