I have a data frame in which one column contains text strings:
1 Blue, Tall, leather, VA
2 Green, Medium, VA*
3 Pink, MD
4 Yellow, MA
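The last 2 characters (sometimes 3, with a trailing '*') are a state abbreviation. I would like to extract everything to the left of the last ',' in each row. What is the best way to do this in R?
I am new to this, so any help is appreciated.
I would like the output to be: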
1 Blue, Tall, leather
2 Green, Medium
3 Pink
4 Yellow
Answer 0 (score: 1)
Use a regular expression:
vector <- c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA")
sub("^(.*),.*$", "\\1", vector)
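Since the question mentions a data frame rather than a bare vector, here is a minimal sketch of the same substitution applied to a column. The names `df`, `text`, and `trimmed` are made up for illustration. The greedy `.*` inside the group grabs as much as it can, so the `,` after it ends up matching the last comma in each string.

```r
# Hypothetical data frame; the column name `text` is an assumption
df <- data.frame(
  text = c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA"),
  stringsAsFactors = FALSE
)

# Keep the captured group (everything before the last comma)
df$trimmed <- sub("^(.*),.*$", "\\1", df$text)
df$trimmed
```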
Answer 1 (score: 1)
Split each string on the commas, then paste everything except the last piece back together, separated by commas:
vector <- c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA")
sapply(X = strsplit(x = vector, split = ","),
       FUN = function(x) paste(head(x, -1), collapse = ","))
#[1] "Blue, Tall, leather" "Green, Medium" "Pink" "Yellow"
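One edge case worth checking is a string that contains no comma at all. As a small sketch (the input `no_comma` is made up for illustration), the split-and-paste approach returns an empty string in that case, whereas a `sub` that targets the last comma leaves the input unchanged:

```r
no_comma <- "Pink"  # hypothetical input with no comma

# head(x, -1) drops the only piece, so paste() collapses an empty vector to ""
split_result <- sapply(strsplit(no_comma, split = ","),
                       function(x) paste(head(x, -1), collapse = ","))

# the pattern finds no comma, so sub() returns the string as-is
sub_result <- sub(",[^,]*$", "", no_comma)
```

Which behaviour is preferable depends on whether rows without a state suffix can occur in the data.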
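Answer 2 (score: 0)
An option with sub: match a , followed by zero or more characters that are not a , ([^,]*) up to the end of the string ($), and replace the match with an empty string (""):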
sub(",[^,]*$", "", v1)
#[1] "Blue, Tall, leather" "Green, Medium" "Pink" "Yellow"
Or use trimws (from R 3.6.0 onward):
trimws(v1, whitespace = ",[^,]*")
#[1] "Blue, Tall, leather" "Green, Medium" "Pink" "Yellow"
Or use str_remove from stringr:
library(stringr)
str_remove(v1, ",[^,]*$")
Data used above:
v1 <- c("Blue, Tall, leather, VA", "Green, Medium, VA*", "Pink, MD", "Yellow, MA")