Separating words from numbers in R

Time: 2014-02-18 10:02:16

Tags: r

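How can I insert a space between a word and the number that follows it? I have been searching, but nothing turned up for R.

I have a data frame like this: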

1 This is the case34
2 To to that case23/234
3 Only Monday223.23

Desired output:

1 "This is the","case 34"
2 "To to that","case 23/234"
3 "Only","Monday 223.23"

That is, with the numbers separated from the words. How can I do this?

4 answers:

Answer 0: (score: 2)

This works for any non-digit character that precedes a digit:

gsub("(\\D)([0-9])","\\1 \\2","papà3")
[1] "papà 3"

Edit:

In response to @Max's comment: the double-space problem is solved by using [[:alpha:]] (as in G. Grothendieck's answer):

> gsub("([[:alpha:]])([0-9])","\\1 \\2","This is the case 34")
[1] "This is the case 34"
> gsub("([[:alpha:]])([0-9])","\\1 \\2","This is the case34")
[1] "This is the case 34"

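It works with accented letters (such as papà3 above). Unlike \\D, it does not match punctuation, and inserting a space after punctuation is probably not what the OP wants.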
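The difference between the two patterns can be seen on strings like those in the question. This is a minimal sketch; the variable names `x`, `with_nondigit`, and `with_alpha` are only illustrative:

```r
x <- c("This is the case 34", "To to that case23/234", "Only Monday223.23")

# \\D matches ANY non-digit, including spaces and punctuation,
# so a space is also inserted after " ", "/" and "."
with_nondigit <- gsub("(\\D)([0-9])", "\\1 \\2", x)

# [[:alpha:]] matches letters only, so spaces and punctuation
# that already precede a digit are left alone
with_alpha <- gsub("([[:alpha:]])([0-9])", "\\1 \\2", x)

print(with_nondigit)
# [1] "This is the case  34"    "To to that case 23/ 234" "Only Monday 223. 23"
print(with_alpha)
# [1] "This is the case 34"    "To to that case 23/234" "Only Monday 223.23"
```

Whether [[:alpha:]] recognises accented letters can depend on the locale, so it is worth checking on your own data.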

Answer 1: (score: 2)

Try read.table together with sub, like this:

# data in reproducible form
DF <- data.frame(s=c("This is the case34", "To to that case23/234", "Only Monday223.23"))

read.table(text = sub("(.*\\S) ([[:alpha:]]+)(\\d.+)$", "\\1,\\2 \\3", DF$s), sep = ",")

This gives the following data frame:

           V1            V2
1 This is the       case 34
2  To to that   case 23/234
3        Only Monday 223.23

Next time, please provide your data in reproducible form.
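To see what the approach above does, it may help to look at the intermediate strings that sub produces before read.table parses them. The sketch below splits the call into two steps; the names `s`, `csv_lines`, and `DF2` are illustrative, and `stringsAsFactors = FALSE` is added only so the columns come back as character vectors:

```r
s <- c("This is the case34", "To to that case23/234", "Only Monday223.23")

# (.*\\S)        everything up to the last word, ending in a non-space
# ([[:alpha:]]+) the letters of the last word
# (\\d.+)$       a digit plus at least one more character, to the end
csv_lines <- sub("(.*\\S) ([[:alpha:]]+)(\\d.+)$", "\\1,\\2 \\3", s)
print(csv_lines)
# [1] "This is the,case 34"    "To to that,case 23/234" "Only,Monday 223.23"

# each string is now one comma-separated record with two fields
DF2 <- read.table(text = csv_lines, sep = ",", stringsAsFactors = FALSE)
print(DF2)
```

Two caveats follow from the pattern itself: `\\d.+` needs at least two characters, so a string ending in a single digit (for example "case3") is left unchanged, and input that already contains commas would be split into extra fields.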

Answer 2: (score: 1)

You can use a regular expression for this task. This should work:

x<-c('This is the case34','To to that case23/234','Only Monday223.23')
gsub('([A-Za-z])([0-9])','\\1 \\2',x)
# [1] "This is the case 34"    "To to that case 23/234" "Only Monday 223.23"  

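However, if your strings contain accented letters, you have to handle them by adding more possibilities to the set of letters. If you also want a space between a digit and a letter that follows it, you can simply make a second pass with the regular expression reversed.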
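Both suggestions can be sketched as follows. This is one possible variant, not the answer's own code: instead of listing accented letters by hand it assumes the PCRE class `\\p{L}` (any Unicode letter), which requires `perl = TRUE`. The names `letter_digit` and `both` are illustrative.

```r
# "\u00e0" is the accented letter "à"
x <- c("This is the case34", "34case", "pap\u00e03")

# first pass: space between a letter and a following digit
letter_digit <- gsub("(\\p{L})([0-9])", "\\1 \\2", x, perl = TRUE)

# second pass: the reversed pattern, a digit followed by a letter
both <- gsub("([0-9])(\\p{L})", "\\1 \\2", letter_digit, perl = TRUE)

print(both)
```

After both passes the three strings should read "This is the case 34", "34 case", and "papà 3".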

Answer 3: (score: 0)

A solution with lapply and regular expressions:

x <- c("This is the case34", "To to that case23/234", "Only Monday223.23")
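# split each string into words; words without digits form the first field,
# words with digits get a space inserted between the letter and the digit
(xx <- lapply(strsplit(x, split = " "), function(w) {
  has_digit <- grepl(pattern = "\\d", x = w)
  c(paste(w[!has_digit], collapse = " "),
    paste(gsub('([A-Za-z])([0-9])', '\\1 \\2', w[has_digit]), collapse = " "))
}))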

[[1]]
[1] "This is the" "case 34"    

[[2]]
[1] "To to that"  "case 23/234"

[[3]]
[1] "Only"          "Monday 223.23"

do.call("rbind", xx)
     [,1]          [,2]           
[1,] "This is the" "case 34"      
[2,] "To to that"  "case 23/234"  
[3,] "Only"        "Monday 223.23"
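
If a data frame is preferred over a character matrix, the same list can be converted in one more step. A small sketch, assuming the column names `text` and `label` (they are made up for illustration and do not come from the question):

```r
x <- c("This is the case34", "To to that case23/234", "Only Monday223.23")

# same splitting logic as above: words without digits vs. words with digits
parts <- lapply(strsplit(x, split = " "), function(w) {
  has_digit <- grepl("\\d", w)
  c(paste(w[!has_digit], collapse = " "),
    paste(gsub("([A-Za-z])([0-9])", "\\1 \\2", w[has_digit]), collapse = " "))
})

# bind the pairs row-wise, then convert the matrix to a data frame
res <- as.data.frame(do.call("rbind", parts), stringsAsFactors = FALSE)
names(res) <- c("text", "label")  # hypothetical column names
print(res)
```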