str_replace with the pattern "" gives "Error in mutate_impl(.data, dots): Evaluation error: Not implemented."

Asked: 2017-12-20 10:18:33

Tags: r dplyr

I have a column in a data frame in which some missing values appear as "".

unique(page_my_df$Type)
[1] "list"              "narrative" "how to"            "news feature"     
[5] "diary"     ""                  "interview" 

I want to replace all instances of "" with "unknown".

page_my_df <- page_my_df %>% 
  mutate(Type = str_replace(.$Type, "", "unknown"),
         Voice = str_replace(.$Voice, "", "unknown"))
  

Error in mutate_impl(.data, dots): Evaluation error: Not implemented.

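Reading some of the documentation here, specifically under pattern:

Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").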

So I tried:

page_my_df <- page_my_df %>% 
  mutate(Type = str_replace(.$Type, boundary(""), "unknown"),
         Voice = str_replace(.$Voice, boundary(""), "unknown"))

Which then gave:

Error in mutate_impl(.data, dots): Evaluation error: 'arg' should be one of “character”, “line_break”, “sentence”, “word”.

How can I replace empty strings with "unknown" inside dplyr::mutate()?

2 Answers:

Answer 0: (score: 3)

Here is one way:

library(tidyverse)
library(stringr)

z <- c( "list",  "narrative",  "how to",  "news feature",  
"diary",  "" , "interview" )

data.frame(element = 1:length(z), Type = z) %>%
  mutate(Type = str_replace(Type, "^$", "unknown"))
#output
  element         Type
1       1         list
2       2    narrative
3       3       how to
4       4 news feature
5       5        diary
6       6      unknown
7       7    interview

Also, there is no need to use .$ to refer to the data frame inside a mutate call.

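The caret ^ and the dollar sign $ are metacharacters that match the empty string at the beginning and end of a line, respectively, so the pattern "^$" matches only a string with nothing between its start and its end, that is, an empty string.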
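To see which elements the anchored pattern actually matches, here is a small sketch. It uses base R's grepl() and sub() as stand-ins for the stringr functions (an assumption made so the snippet runs without extra packages), and the names is_empty and replaced are made up for illustration.

```r
# Same sample vector as in the answer above.
z <- c("list", "narrative", "how to", "news feature",
       "diary", "", "interview")

# "^$" requires the start of the string to be immediately followed by
# its end, so only the empty element matches.
is_empty <- grepl("^$", z)
print(which(is_empty))

# sub() is a base R stand-in for str_replace() here: it rewrites the
# empty element and leaves every other element untouched.
replaced <- sub("^$", "unknown", z)
print(replaced)
```

Because the pattern is anchored at both ends, non-empty values such as "how to" are never altered.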

Answer 1: (score: 2)

Another solution, by checking the string length:

library(dplyr)

strings <- c("list","narrative","how to","news feature","diary","","interview" )
df <- data.frame(ID = 1:length(strings), strings, stringsAsFactors = FALSE)

> df
  ID      strings
1  1         list
2  2    narrative
3  3       how to
4  4 news feature
5  5        diary
6  6             
7  7    interview

df <- df %>% mutate(strings = if_else(nchar(strings) == 0, "unknown", strings))

> df
  ID      strings
1  1         list
2  2    narrative
3  3       how to
4  4 news feature
5  5        diary
6  6      unknown
7  7    interview
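The question needs the same replacement in two columns, Type and Voice. One possible way to apply the length check to both at once is sketched below. It assumes dplyr 1.0.0 or later for across(); the sample data and the helper name blank_to_unknown are hypothetical and not taken from the original post.

```r
library(dplyr)

# Hypothetical stand-in for the asker's page_my_df; the Voice values
# are invented for illustration.
page_my_df <- data.frame(
  Type  = c("list", "", "diary"),
  Voice = c("", "first person", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: swap zero-length strings for "unknown".
blank_to_unknown <- function(x) if_else(nchar(x) == 0, "unknown", x)

# across() applies the helper to each listed column (dplyr >= 1.0.0).
page_my_df <- page_my_df %>%
  mutate(across(c(Type, Voice), blank_to_unknown))

print(page_my_df)
```

Columns are referred to by bare name inside mutate(), so no .$ prefix is needed.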