I have a column in my df in which some missing values appear as "".
unique(page_my_df$Type)
[1] "list" "narrative" "how to" "news feature"
[5] "diary" "" "interview"
I want to replace all instances of "" with "unknown".
page_my_df <- page_my_df %>%
mutate(Type = str_replace(.$Type, "", "unknown"),
Voice = str_replace(.$Voice, "", "unknown"))
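Error in mutate_impl(.data, dots): Evaluation error: Not implemented.
Reading the documentation, specifically the description of the pattern argument:
Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").
So I tried: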
page_my_df <- page_my_df %>%
mutate(Type = str_replace(.$Type, boundary(""), "unknown"),
Voice = str_replace(.$Voice, boundary(""), "unknown"))
Which then gave:
Error in mutate_impl(.data, dots): Evaluation error: 'arg' should be one of "character", "line_break", "sentence", "word".
How can I replace empty strings with "unknown" inside dplyr::mutate()?
Answer 0 (score: 3)
Here is one approach:
library(tidyverse)
library(stringr)
z <- c( "list", "narrative", "how to", "news feature",
"diary", "" , "interview" )
data.frame(element = 1:length(z), Type = z) %>%
mutate(Type = str_replace(Type, "^$", "unknown"))
#output
element Type
1 1 list
2 2 narrative
3 3 how to
4 4 news feature
5 5 diary
6 6 unknown
7 7 interview
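Also, there is no need to use .$ inside mutate(); columns can be referenced by name.
^ and the dollar sign $ are metacharacters that match the empty string at the beginning and end of a line (here, of each string), respectively, so the pattern "^$" matches only a string that is entirely empty.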
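To see the anchors at work outside a data frame, here is a minimal sketch on a plain character vector. The vector x is hypothetical and only serves as an illustration:

```r
library(stringr)

# Hypothetical input: one empty string among non-empty values
x <- c("list", "", "how to")

# "^$" matches only when the start and end of the string coincide,
# i.e. when the string is empty; non-empty strings are left untouched
replaced <- str_replace(x, "^$", "unknown")
print(replaced)
```

Only the empty element should change, because no non-empty string can satisfy both anchors with nothing in between.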
Answer 1 (score: 2)
Another solution, which checks the string length:
library(dplyr)
strings <- c("list","narrative","how to","news feature","diary","","interview" )
df <- data.frame(ID = 1:length(strings), strings, stringsAsFactors = FALSE)
> df
ID strings
1 1 list
2 2 narrative
3 3 how to
4 4 news feature
5 5 diary
6 6
7 7 interview
df <- df %>% mutate(strings = if_else(nchar(strings) == 0, "unknown", strings))
> df
ID strings
1 1 list
2 2 narrative
3 3 how to
4 4 news feature
5 5 diary
6 6 unknown
7 7 interview
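Since the question involves two columns (Type and Voice), the same length check could be applied to both in one step. This is only a sketch: it assumes dplyr 1.0.0 or later (for across()), and the Voice values below are made up for illustration.

```r
library(dplyr)

# Hypothetical data: the Voice values are invented for this example
page_my_df <- data.frame(
  Type  = c("list", "", "diary"),
  Voice = c("", "first person", ""),
  stringsAsFactors = FALSE
)

# Apply the same empty-string check to both columns
# (across() assumes dplyr >= 1.0.0)
page_my_df <- page_my_df %>%
  mutate(across(c(Type, Voice), ~ if_else(nchar(.x) == 0, "unknown", .x)))

print(page_my_df)
```

Keeping stringsAsFactors = FALSE matters here, because if_else() expects both branches to be character vectors.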