Question

I have a column containing string values, as shown below:

a=["iam best in the world", "you are awesome", "Iam Good"]

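I need to find which rows contain only lowercase words separated by spaces.

I know how to convert the strings to uppercase, space-separated form, but I need to find out which rows are already lowercase and space-separated.

I tried using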

grepl("\\b([a-z])\\s([a-z])\\b",aa, perl =  TRUE)

Answer 1

We can try using grepl with the pattern \b[a-z]+(?:\s+[a-z]+)*\b:

matches = a[grepl("\\b[a-z]+(?:\\s+[a-z]+)*\\b", a$some_col), ]
matches

  v1              some_col
1  1 iam best in the world
2  2       you are awesome

Data:

a <- data.frame(v1=c(1:3),
                some_col=c("iam best in the world", "you are awesome", "Iam Good"))

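The regex pattern matches an all-lowercase word, followed by whitespace and another all-lowercase word, with that second part repeated zero or more times. Note that we place word boundaries around the pattern so that we do not get a false positive from a word that begins with an uppercase letter.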
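One caveat: because the pattern is not anchored, it is satisfied as soon as a single all-lowercase word is found, so a mixed-case row such as "iam Good" would also be flagged. The following is a minimal sketch of a stricter variant, assuming the goal is that every word in the row is lowercase. The `^` and `$` anchors, the `perl = TRUE` argument, and the fourth data row are additions for illustration and are not part of the original answer.

```r
# Illustrative data: the first three rows come from the answer above;
# the fourth, mixed-case row is a hypothetical addition.
a <- data.frame(v1 = 1:4,
                some_col = c("iam best in the world", "you are awesome",
                             "Iam Good", "iam Good"),
                stringsAsFactors = FALSE)

# Unanchored: TRUE as soon as one all-lowercase word is found.
loose <- grepl("\\b[a-z]+(?:\\s+[a-z]+)*\\b", a$some_col, perl = TRUE)

# Anchored: the whole string must be lowercase words separated by whitespace.
strict <- grepl("^[a-z]+(?:\\s+[a-z]+)*$", a$some_col, perl = TRUE)

# Keep only the rows that pass the stricter check.
a[strict, ]
```

With this data, `loose` flags the mixed-case row while `strict` does not, so the anchored form is the safer choice if mixed-case rows can occur.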

Answer 2

x <- c("iam best in the word ", "you are awesome", "Iam Good")

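Here I took a slightly different approach: I first split each phrase on spaces, then check whether each word starts with a lowercase letter. The output is a list that holds, for each phrase, only the words that start with a lowercase letter.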

sapply(strsplit(x, " "), function(x) {
  x[grepl("^[a-z]", x)]
})
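If a single TRUE/FALSE per phrase is wanted instead of a list of words, the same split-then-check idea can be reduced with `all()`. This is a sketch under the assumption that a phrase qualifies only when every word consists entirely of lowercase letters; the name `all_lower` is hypothetical, and `trimws()` is used to ignore the trailing space in the first phrase.

```r
x <- c("iam best in the word ", "you are awesome", "Iam Good")

# Split each phrase on runs of whitespace, then require every word
# to consist of lowercase letters only.
all_lower <- vapply(strsplit(trimws(x), "\\s+"), function(words) {
  all(grepl("^[a-z]+$", words))
}, logical(1))

# Phrases in which every word is lowercase.
x[all_lower]
```

Note that an empty string yields no words, and `all()` of an empty vector is TRUE, so empty strings would need separate handling if they can occur.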

Answer 3

We can convert the column to lowercase and compare it with the actual values. Using @Tim's data:

a[tolower(a$some_col) == a$some_col, ]

#  v1              some_col
#1  1 iam best in the world
#2  2       you are awesome

If we also need to check for spaces, we can add another condition using grepl:

a[tolower(a$some_col) == a$some_col & grepl("\\s+", a$some_col), ]
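To make the behavior of the two conditions concrete, here is a small sketch on hypothetical values (not from the original data). It shows that `tolower(s) == s` only verifies the absence of uppercase letters, so strings made of digits or punctuation also pass. The stricter `letters_only` pattern is an assumption about what "lowercase words separated by spaces" should mean.

```r
# Hypothetical values for illustration only.
s <- c("you are awesome", "awesome", "123 456", "Iam Good")

# TRUE when the string contains no uppercase letters.
no_upper <- tolower(s) == s

# TRUE when the string contains at least one whitespace character.
has_space <- grepl("\\s+", s)

# Stricter alternative: two or more words made of lowercase letters only.
letters_only <- grepl("^[a-z]+(\\s+[a-z]+)+$", s)

# Combined condition from the answer versus the stricter pattern.
s[no_upper & has_space]
s[letters_only]
```

Which version to use depends on whether non-letter content such as numbers should count as "lowercase".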

Answer 4

Another idea is to use stri_trans_totitle from the stringi package. This keeps the rows whose value differs from its title-case form:

a[stringi::stri_trans_totitle(as.character(a$some_col)) != a$some_col, ]

#  v1              some_col
#1  1 iam best in the world
#2  2       you are awesome

Answer 5

We can use filter from dplyr:

library(dplyr)
a %>%
   filter(tolower(some_col) == some_col)
#   v1              some_col
#1  1 iam best in the world
#2  2       you are awesome
