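Trying to split a string on the character "|" does not work

Asked: 2016-01-29 18:59:06

Tags: r

I have a text string that I want to split on the character | (vertical bar / pipe, i.e. the character with ASCII code 124).

When I try this, my string gets split at every character. That is, the following code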

string <- "Hello | Good bye!"
split <- strsplit(string, "|")
print(split[[1]])

produces this output

[1] "H" "e" "l" "l" "o" " " "|" " " "G" "o" "o" "d" " " "b" "y" "e" "!"

If I simply change the | symbol to / (or any other character), it works as expected. That is, the following code

string <- "Hello / Good bye!"
split <- strsplit(string, "/")
print(split[[1]])

produces this output

[1] "Hello "     " Good bye!"

which is what I want.

1 answer:

Answer 0: (score: 4)

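By default, strsplit treats its split argument as a regular expression, in which | is a metacharacter (alternation). You need to use fixed = TRUE so that metacharacters such as | are interpreted literally: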

string <- "Hello | Good bye!"
strsplit(string, "|", fixed = TRUE)
#[[1]]
#[1] "Hello "     " Good bye!"

Similarly,

strsplit("Hello . Good bye!", ".")[[1]]
#[1] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
strsplit("Hello . Good bye!", ".", fixed = TRUE)[[1]]
#[1] "Hello "     " Good bye!"
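
To make the cause a little more concrete, here is a minimal sketch that compares the regex and fixed interpretations side by side. The variable names are only illustrative, and the comments describe my reading of why the unescaped patterns behave as they do, not something stated in the answer itself.

```r
string <- "Hello | Good bye!"

# An unescaped "|" is an alternation between two empty patterns, so it can
# only produce empty matches; strsplit() then behaves like split = "".
by_pipe_regex <- strsplit(string, "|")[[1]]
by_empty      <- strsplit(string, "")[[1]]
print(identical(by_pipe_regex, by_empty))

# An unescaped "." matches any single character, so every character is
# consumed as a delimiter and only empty strings remain.
by_dot_regex <- strsplit("Hello . Good bye!", ".")[[1]]
print(all(by_dot_regex == ""))

# With fixed = TRUE the pattern is compared as a plain string.
by_pipe_fixed <- strsplit(string, "|", fixed = TRUE)[[1]]
print(by_pipe_fixed)
```

If the delimiter is always a plain string, fixed = TRUE is probably the simplest choice, since nothing in the pattern needs escaping.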

Alternatively, you can escape such characters manually with a double backslash

strsplit("Hello | Good bye!", "\\|")[[1]]
#[1] "Hello "     " Good bye!"

or wrap them in \\Q...\\E, which escapes every non-alphanumeric character between them:

strsplit("Hello | Good bye!", "\\Q|\\E")[[1]]
#[1] "Hello "     " Good bye!"
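
Since every result above still carries the spaces around the delimiter, a small follow-up sketch may be useful. The helper name split_literal is hypothetical (it is not part of base R or of the answer), and the code assumes R 3.2.0 or later, where trimws() is available. The last line shows a character class, which is one more way to make | literal inside a regular expression.

```r
# Hypothetical helper: split on a literal delimiter, then strip the
# whitespace around each piece.
split_literal <- function(x, delim) {
  pieces <- strsplit(x, delim, fixed = TRUE)  # delim is taken literally
  lapply(pieces, trimws)                      # one trimmed vector per input string
}

parts <- split_literal(c("Hello | Good bye!", "a|b|c"), "|")
print(parts[[1]])
print(parts[[2]])

# Inside a bracket expression, "|" has no special meaning.
by_class <- strsplit("Hello | Good bye!", "[|]")[[1]]
print(by_class)
```

Because strsplit() is vectorised over its first argument, the helper returns a list with one element per input string.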