Using select instead of gsub to avoid multiple regex evaluations in Ruby

Time: 2012-04-24 06:48:36

Tags: ruby regex gsub

The code below needs several regex evaluations, but it does what I want (it removes everything except the text).

words = IO.read("file.txt").
gsub(/\s/, ""). # delete white spaces
gsub(".",""). # delete periods
gsub(",",""). # delete commas
gsub("?","") # delete Q marks
puts words
# output
#      WheninthecourseofhumaneventsitbecomesnecessaryIwanttobelieveyoureallyIdobutwhoamItoblameWhenthefactsarecountedthenumberswillbereportedLotsoflaughsCharlieIthinkIheardthatonetentimesbefore

After reading the post "Ruby gsub : is there a better way", I thought I would try a match to get the same result without multiple regex evaluations. But I don't get the same output.

words = IO.read("file.txt").
match(/(\w*)+/)
puts words
# output - this only gets the first word
# When

This one gets only the first sentence:

words = IO.read("file.txt").
match(/(...*)+/)
puts words

# output - this only gets the first sentence
# When in the course of human events it becomes necessary.

Any suggestions for getting the same output (including stripping whitespace and non-word characters) with a match instead of gsub?

2 Answers:

Answer 0: (score: 1)

You can do what you want in a single gsub operation:

s = 'When in the course of human events it becomes necessary.'
s.gsub(/[\s.,?]/, '')
# => "Wheninthecourseofhumaneventsitbecomesnecessary"
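
If the goal is to avoid regex evaluation altogether, `String#delete` accepts a set of characters and removes each of them. The following is only a minimal sketch: the helper name `strip_punctuation` is hypothetical, and the character set is an assumption meant to mirror `/[\s.,?]/` (whitespace plus `.`, `,` and `?`).

```ruby
# Hypothetical helper; not part of the original answer.
def strip_punctuation(text)
  # String#delete takes a character set, so no regex is involved.
  # The set is assumed to mirror /[\s.,?]/: whitespace characters plus . , ?
  text.delete(" \t\r\n\f\v.,?")
end

s = 'When in the course of human events it becomes necessary.'
puts strip_punctuation(s)
# prints "Wheninthecourseofhumaneventsitbecomesnecessary"
```

Like the character class above, this removes only the listed characters, so other punctuation such as `!` or `;` would remain in the result.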

Answer 1: (score: 0)

You don't need multiple regex evaluations for this.

str = "# output - this only gets the first sentence
# When in the course of human events it becomes necessary."
p str.gsub(/\W/, "")
#=>"outputthisonlygetsthefirstsentenceWheninthecourseofhumaneventsitbecomesnecessary"
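
As for the `match` attempts in the question: `String#match` returns only the first match, which is why it stopped at the first word. `String#scan` collects every match instead, so joining its results should give the same string as `gsub(/\W/, "")`. A minimal sketch, with `words_only` as a hypothetical helper name:

```ruby
# Hypothetical helper; not part of the original answer.
def words_only(text)
  # scan returns an array of every run of word characters;
  # join concatenates them with no separator.
  text.scan(/\w+/).join
end

str = "# output - this only gets the first sentence
# When in the course of human events it becomes necessary."
p words_only(str)
#=>"outputthisonlygetsthefirstsentenceWheninthecourseofhumaneventsitbecomesnecessary"
```

Note that `\w` also matches digits and underscores, so those are kept by both this sketch and the `gsub(/\W/, "")` version.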