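I have a data frame df with a few columns, one of which contains text. I want to remove every word that is shorter than 4 characters. The result I expect is expected_df. A reproducible example follows.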
df<-data.frame(client=c("My Name is abcdff","Name is not right","Bangalore is getting hoter","BBa wasa school topper"),serial_numer=c(1:4))
expected_df<-data.frame(client=c("Name abcdff","Name right","Bangalore getting hoter","wasa school topper"),serial_numer=c(1:4))
This is what I tried:
df$client<-as.character(df$client)
df$client[nchar(df$client) > 3]
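Answer 0 (score: 1)
nchar(df$client) measures the length of each whole string, so the filter has to be applied per word instead. We can split each string on whitespace, count the characters in each word, and keep only the words with 4 or more characters.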
df$client <- sapply(strsplit(as.character(df$client), "\\s+"), function(x)
  paste0(x[nchar(x) >= 4], collapse = " "))
df
#                   client serial_numer
#1             Name abcdff            1
#2              Name right            2
#3 Bangalore getting hoter            3
#4      wasa school topper            4
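
As a possible alternative to splitting, the same filtering can be sketched with a single regular expression. This is only a sketch under some assumptions: drop_short_words and its min_len argument are hypothetical names introduced here, not part of the answer above, and min_len is assumed to be at least 2. A "word" here means a run of \w characters, so text containing apostrophes or hyphens may be handled differently than with the whitespace split.

```r
# Hypothetical helper (not from the original answer):
# remove words shorter than min_len characters from each string.
# Assumes min_len >= 2.
drop_short_words <- function(x, min_len = 4) {
  # Match a whole word of 1..(min_len - 1) word characters,
  # together with any whitespace that follows it.
  pattern <- sprintf("\\b\\w{1,%d}\\b\\s*", min_len - 1)
  # Delete the matches, then trim any whitespace left at the ends.
  trimws(gsub(pattern, "", x, perl = TRUE))
}

df <- data.frame(
  client = c("My Name is abcdff", "Name is not right",
             "Bangalore is getting hoter", "BBa wasa school topper"),
  serial_numer = 1:4,
  stringsAsFactors = FALSE
)

df$client <- drop_short_words(df$client)
df
```

For the sample data this should give the same client column as expected_df. The strsplit approach is probably the safer choice when words can contain punctuation, because it only treats whitespace as a separator.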