How to convert special characters to blank in a data frame

Time: 2018-01-04 06:05:09

Tags: r

I have a data frame, df, containing values that start with "-", as shown below:

A = c("A","A","A","A","A")
B =c("---","21","31","423","e")
C = c("0","0","----","p","1.75")
D = c("10","-----","d","-","1.3")
E = c("0","---","N","1.5","1.75")
df =  data.frame(A,B,C,D,E)

I get an error when I try to replace the values that start with "-" with blanks, using the following code:

df1 = str_replace_all(df, grepl("-",df), " ")

Thanks in advance.

1 Answer:

Answer 0: (score: 0)

We can do this with replace and grepl, applied to each column, because str_replace_all works on a character vector and not on a data.frame:

library(dplyr)
df %>% mutate_all(funs(replace(., grepl('-', .), '')))
#  A   B    C   D    E
#1 A        0  10    0
#2 A  21    0
#3 A  31        d    N
#4 A 423    p      1.5
#5 A   e 1.75 1.3 1.75
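The mutate_all()/funs() call above reflects the dplyr API at the time of the answer. In more recent dplyr releases funs() is deprecated and mutate_all() is superseded by across(). A minimal sketch of the same replacement, assuming dplyr 1.0.0 or later is installed (df_clean is a name chosen here for illustration):

```r
library(dplyr)

# Same data as in the question, built with character columns.
df <- data.frame(
  A = c("A", "A", "A", "A", "A"),
  B = c("---", "21", "31", "423", "e"),
  C = c("0", "0", "----", "p", "1.75"),
  D = c("10", "-----", "d", "-", "1.3"),
  E = c("0", "---", "N", "1.5", "1.75"),
  stringsAsFactors = FALSE
)

# Apply the replacement to every column; .x is the current column vector.
df_clean <- df %>%
  mutate(across(everything(), ~ replace(.x, grepl("-", .x), "")))

df_clean
```

The lambda does exactly what the funs() version does: any value containing a hyphen becomes an empty string.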

Or using str_replace:

library(stringr)
df %>% mutate_all(funs(str_replace(., "^-+$", "")))
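The two patterns used above, '-' and "^-+$", are not equivalent, which matters if the data can contain negative numbers or hyphenated text. The sketch below compares them on a small made-up vector x (not from the question). Base R's sub() stands in for str_replace() so the snippet runs without extra packages.

```r
# Hypothetical test values: pure hyphen runs, a negative number, hyphenated text.
x <- c("---", "-", "-1.5", "a-b", "21")

# grepl("-", x) flags every value that contains a hyphen anywhere.
any_hyphen <- replace(x, grepl("-", x), "")
any_hyphen
# "" "" "" "" "21"

# The anchored pattern only blanks values made up entirely of hyphens.
only_hyphens <- sub("^-+$", "", x)
only_hyphens
# "" "" "-1.5" "a-b" "21"

# "^-" blanks values that merely start with a hyphen, as the question words it.
starts_with_hyphen <- replace(x, grepl("^-", x), "")
starts_with_hyphen
# "" "" "" "a-b" "21"
```

On the data in the question all three give the same result, since every value containing a hyphen consists only of hyphens.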

With lapply from base R, we can use:

df[] <- lapply(df, function(x) replace(x, grepl('-', x), ''))

It is better to create character columns instead of factor columns. Use stringsAsFactors = FALSE in data.frame to do that.

Data

df <- data.frame(A,B,C,D,E, stringsAsFactors = FALSE)
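To illustrate why character columns matter here: assigning '' into a factor whose levels do not include '' produces NA with a warning, whereas the same call on a character vector stores the empty string. A small sketch, where f, s and df_fac are made-up names for illustration. Since R 4.0.0 data.frame() defaults to stringsAsFactors = FALSE, so this mainly affects older R versions or data read in as factors.

```r
# Factor column: "" is not an existing level, so the assignment yields NA
# (R also emits an "invalid factor level" warning, suppressed here).
f <- factor(c("---", "21", "e"))
f_out <- suppressWarnings(replace(f, grepl("-", f), ""))
is.na(f_out)

# Character column: the same call stores the empty string as intended.
s <- c("---", "21", "e")
s_out <- replace(s, grepl("-", s), "")
s_out

# If the data frame already has factor columns, convert them first.
df_fac <- data.frame(B = c("---", "21", "e"), stringsAsFactors = TRUE)
df_fac[] <- lapply(df_fac, as.character)
df_fac[] <- lapply(df_fac, function(x) replace(x, grepl("-", x), ""))
df_fac
```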