I am trying to clean some data, and I have run into two things I cannot find a solution for.
I have the following character vector:
"4353545 Here comes sentence."
"and now one more"
I want to replace each of them with an empty string. For the first one I tried:
gsub("\\^[0-9].*","",dataframe$column) # if it starts with number replace with empty string
And for the second one:
gsub("\\^[a-z].*","",dataframe$column) # when it starts with letter instead of number = empty string
Neither of these works. In this case, however, it does work:
"! andn now one more"
gsub("\\!.*","",dataframe$column) # here this solution works; it starts with excl. and its replaced with empty string
Answer 0: (score: 4)
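You can use the regular expression ^[0-9a-z](.*) to match strings that start with a digit or a lowercase letter.
Then use sub instead of gsub: gsub looks for all matches, and you only want the first one.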
> ( x <- c("4353545 Here comes sentence.", "and now one more",
"! andn now one more") )
# [1] "4353545 Here comes sentence." "and now one more"
# [3] "! andn now one more"
> sub("^[0-9a-z](.*)", "", x)
# [1] "" "" "! andn now one more"
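Note: as nrussell pointed out, you should not escape ^ when it is meant to denote the beginning of the string. To anchor a match at the beginning of the string, just use ^ as is.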
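To make the difference concrete, here is a minimal sketch comparing the escaped and unescaped caret on a data frame column. The data frame df and its column name are hypothetical stand-ins for the asker's dataframe$column. It also suggests why the "\\!.*" attempt worked: ! has no special meaning in a regular expression, so escaping it changes nothing, whereas "\\^" turns the start-of-string anchor into a literal ^ character.

```r
# Hypothetical data frame standing in for the asker's dataframe$column
df <- data.frame(
  column = c("4353545 Here comes sentence.",
             "and now one more",
             "! andn now one more"),
  stringsAsFactors = FALSE
)

# Escaped caret: "\\^" looks for a literal "^" character.
# None of the strings contain one, so nothing is replaced.
escaped <- sub("\\^[0-9].*", "", df$column)

# Unescaped caret: "^" anchors the match at the start of the string.
anchored <- sub("^[0-9a-z].*", "", df$column)

print(identical(escaped, df$column))  # TRUE: the column is unchanged
print(anchored)
```

Assigning the result back, for example df$column <- anchored, would update the column in place.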