How do I split a string into multiple columns by a given pattern?

Asked: 2019-05-03 09:36:51

Tags: r regex stringr

My text data looks like this:

\r\n    \r\n        How to get a confirm ticket?\r\n        \r\n            I want to get a tatkal ticket confirm ...

How can I extract two columns from this data?

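I tried str_split_fixed(). It splits the string into four columns, and the two columns I need can then be pulled out of those four, but I would like a call that returns only the two columns.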

library(stringr)
x <- "\r\n    \r\n        How to get a confirm ticket?\r\n        \r\n            I want to get a tatkal ticket confirm ..."
str_split_fixed(x, "\r\n", 4)
#>      [,1] [,2]   [,3]                                   [,4]                                                               
#> [1,] ""   "    " "        How to get a confirm ticket?" "        \r\n            I want to get a tatkal ticket confirm ..."
str_split_fixed(x, "\r\n", 4)[1, 3]
#> [1] "        How to get a confirm ticket?"

1 answer:

Answer 0: (score: 1)

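If the strings always follow the same format, the following regular expression should work. It treats each run of line breaks plus the whitespace after them as a single separator: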

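library(stringr)
x <- "\r\n    \r\n        How to get a confirm ticket?\r\n        \r\n            I want to get a tatkal ticket confirm ..."
# Split on one or more repetitions of "\r\n followed by optional whitespace".
# The string starts with a separator, so the first piece is empty; [, -1] drops it.
str_split(x, "(\r\n\\s*)+", simplify = TRUE)[, -1, drop = FALSE]
#>      [,1]                           [,2]                                       
#> [1,] "How to get a confirm ticket?" "I want to get a tatkal ticket confirm ..."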
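
As a variation on the split above, here is a minimal sketch that trims the string first, so there is no empty leading piece to drop afterwards. It assumes every string holds exactly two text fragments separated by whitespace that contains at least one \r\n. The variable name parts is only illustrative.

```r
library(stringr)

x <- "\r\n    \r\n        How to get a confirm ticket?\r\n        \r\n            I want to get a tatkal ticket confirm ..."

# Remove leading and trailing whitespace (including \r\n) so the string
# starts directly with the first fragment.
trimmed <- str_trim(x)

# Split at the first run of whitespace that contains a \r\n.
# n = 2 caps the result at two columns (assumption: two fragments per string).
parts <- str_split_fixed(trimmed, "\\s*\r\n\\s*", 2)

print(parts)
```

Both str_trim() and str_split_fixed() are vectorised, so passing a character vector instead of a single string should give a matrix with one row per input string.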

If your data actually comes from a table in a text file or from a web page, there may be more convenient options.