I have a CSV file like this:
Identity,Keyword
23, The weather is perfect for good days football the players are healthy
45, 1 Locksmith services Locally Owned and Operated Fast response time Call Now
I want to reduce the number of words in the Keyword column to 10.
Expected output:
Identity,Keyword
23, The weather is perfect for good days football the players
45, 1 Locksmith services Locally Owned and Operated Fast response time
I am using this code:
keyword <- sapply(record$Keyword,function(x) gsub("^((\\w+\\W+){9}\\w+).*","\\1",x))
The second row (Identity 45) is not reduced to 10 words. What is going wrong? Any help is appreciated.
Answer 0 (score: 4)
For the benefit of others who may offer different answers, I have added your data in a copy/paste-ready format...
# The data....
df <- read.table( text = "Identity,Keyword
23, \'The weather is perfect for good days football the players are healthy\'
45, \'1 Locksmith services Locally Owned and Operated Fast response time Call Now\'" , header = TRUE , sep = "," , stringsAsFactors = FALSE)
# Strip out leading and trailing spaces (which were a problem for me)
df$Keyword <- gsub( "^ +| +$" , "" , df$Keyword )
# Split words on spaces, and select the first 10 elements of each
ll <- lapply( strsplit( df$Keyword , " " ) , `[` , 1:10 )
# Collapse to a single 10 word string and add to the original data.frame
df$Short <- sapply( ll , paste , collapse = " " )
# Identity Keyword Short
#1 23 The weather is perfect for good days football the players are healthy The weather is perfect for good days football the players
#2 45 1 Locksmith services Locally Owned and Operated Fast response time Call Now 1 Locksmith services Locally Owned and Operated Fast response time
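Two side notes, offered as a sketch rather than a confirmed diagnosis of the original data. First, a leading space is one way the pattern in the question can fail: `^` followed by `\\w+` cannot match when the string starts with whitespace, so `gsub` returns the input unchanged. Second, indexing with `1:10` pads strings shorter than 10 words with `NA`, whereas `head()` does not. The helper name `first_n_words` and the third sample string are my own additions for illustration; `trimws()` assumes R 3.2.0 or later.

```r
# Hypothetical helper (not part of the original answer): keep at most n words.
first_n_words <- function(x, n = 10) {
  x <- trimws(x)                        # drop leading/trailing whitespace
  words <- strsplit(x, "[[:space:]]+")  # split on runs of whitespace
  # head() returns fewer than n words for short strings instead of padding with NA
  vapply(words, function(w) paste(head(w, n), collapse = " "), character(1))
}

# Sample strings with a leading space, as they appear after the comma in the CSV.
# The third string is invented here to show the short-input case.
keywords <- c(
  " The weather is perfect for good days football the players are healthy",
  " 1 Locksmith services Locally Owned and Operated Fast response time Call Now",
  " Only three words"
)

short <- first_n_words(keywords)

# The pattern from the question does not match a string that starts with a space,
# so the input comes back unchanged.
orig_pattern <- "^((\\w+\\W+){9}\\w+).*"
unchanged <- gsub(orig_pattern, "\\1", keywords[2])

# Allowing optional leading whitespace lets the same idea work as a regex.
tolerant <- sub("^\\s*((\\w+\\W+){9}\\w+).*", "\\1", keywords[2])

print(short)
print(identical(unchanged, keywords[2]))
print(tolerant)
```

If the real file contains other characters around the words (punctuation, tabs, non-breaking spaces), the split pattern or the regex would need adjusting accordingly.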