Here is my code:
str = "In 2004, Obama received national attention during his campaign to represent Illinois in the United States Senate"
arr = str.scan(/\S+(?:\s+\S+)?/)
This gives:
arr=["In 2004,", "Obama received", "national attention", "during his", "campaign to", "represent Illinois", "in the", "United States", "Senate"]
fresh_arr = []
arr.each do |el|
  # Test each element, not the whole array (Array has no #match method).
  if !el.match(/is|am|are|this|his/)
    fresh_arr << el
  end
end
Now I want to delete the elements that contain any of the strings (is, am, are, this, his), so that the result looks like this:
arr=["Obama received", "national attention","represent Illinois","United States", "Senate"]
My data set is very large and this takes about 6 seconds. Is there any other way to do this?
Answer 0 (score: 2)
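Here is a simple approach, though I'm not sure about its performance, because map
still runs the same kind of loop you are already running.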
arr.map{|x| x unless x =~ /\b(in|am|are|his|this)\b/i}.compact
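As a rough sketch of the same idea without the intermediate nils, the filter can be written with `reject` and a precompiled whole-word pattern. The constant names `STOP_WORDS` and `STOP_WORD_RE` are made up for illustration, the word list follows the regex above (which uses "in" rather than "is"), and `match?` assumes Ruby 2.4 or later.

```ruby
# Hypothetical names; the word list mirrors the regex used in this answer.
STOP_WORDS = %w[in am are his this].freeze

# \b anchors the match to whole words, so "Illinois" and "campaign"
# are not treated as containing "in" or "am".
STOP_WORD_RE = /\b(?:#{Regexp.union(STOP_WORDS).source})\b/i

arr = ["In 2004,", "Obama received", "national attention", "during his",
       "campaign to", "represent Illinois", "in the", "United States", "Senate"]

# reject returns a new array and leaves arr untouched.
fresh_arr = arr.reject { |el| el.match?(STOP_WORD_RE) }
```

With whole-word matching, "campaign to" is kept, because "am" only occurs inside a longer word; this differs from the expected result listed in the question.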
Benchmark:
> my_bm(500000){arr.map{|x| x unless x =~ /\b(in|am|are|his|this)\b/i}.compact}
user system total real
7.430000 0.000000 7.430000 ( 7.451064)
=> nil
> my_bm(500000){arr.reject! { |e| e =~ /\b(in|am|are|his|this)\b/i }}
user system total real
4.620000 0.000000 4.620000 ( 4.623782)
> my_bm(5000000){arr.map{|x| x unless x =~ /\b(in|am|are|his|this)\b/i}.compact}
user system total real
50.790000 0.010000 50.800000 ( 50.840533)
> my_bm(5000000){arr.reject! { |e| e =~ /\b(in|am|are|his|this)\b/i }}
user system total real
46.140000 0.010000 46.150000 ( 46.198752)
=> nil
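The definition of `my_bm` is not shown. A minimal sketch of what such a helper might look like, assuming it simply wraps Ruby's standard `Benchmark.bm` and repeats the block the given number of times:

```ruby
require "benchmark"

# Hypothetical stand-in for the my_bm helper used above; the original
# definition is not shown, so this is only a guess at its shape.
def my_bm(iterations, &block)
  Benchmark.bm do |bm|
    # Run the block `iterations` times inside one timed report.
    bm.report { iterations.times(&block) }
  end
  nil
end
```

One caveat when reading the numbers: `reject!` modifies `arr` in place, so after the first iteration it operates on an already filtered array. Timing the non-mutating `arr.reject { ... }` may give a fairer comparison with `map ... compact`.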