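Ruby script: do not print duplicate lines

Time: 2013-10-29 00:47:15

Tags: ruby

What can I add to this script so that it does not print duplicate lines from a txt file?

The script is: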

class TestKeyword
  file = File.new("test.txt", "r")
  while (line = file.gets)
    if line['MAY_DAY']
      date = line[/\w+ +\d+ +\d+:\d+:\d+/]
      puts "#{date}"
    end
  end
end

This is the test file:

Oct 15 12:54:01 WHERE IS THE LOVIN MAY_DAY
Oct 16 23:15:44 WHAT THE HECK CAN I DO ABOUT IT HUMP_DAY 
Oct 16 14:16:09 I LOVE MY BABY GIRL MAY_DAY 
Oct 16 08:25:18 CAN WAIT UNTIL MY BABY RECOVERS CRYSTAL_WIFE 
Oct 18 17:48:38 I HOPE HE STOP MESSING WITH THESE FOOLISH CHILDREN TONY_SMITH 
Oct 19 05:17:58 GAME TIME GO HEAD AND GET ME MAY_DAY 
Oct 20 10:23:33 GAMESTOP IS WHERE ITS AT GAME_DAY
Oct 21 03:54:27 WHAT IS GOING ON WITH MY LUNCH HUNGRY_MAN
Oct 15 12:54:01 WHERE IS THE LOVIN MAY_DAY
Oct 16 23:15:44 WHAT THE HECK CAN I DO ABOUT IT HUMP_DAY 
Oct 16 14:16:09 I LOVE MY BABY GIRL MAY_DAY 
Oct 16 08:25:18 CAN WAIT UNTIL MY BABY RECOVERS CRYSTAL_WIFE 
Oct 18 17:48:38 I HOPE HE STOP MESSING WITH THESE FOOLISH CHILDREN TONY_SMITH 
Oct 19 05:17:58 GAME TIME GO HEAD AND GET ME MAY_DAY 
Oct 20 10:23:33 GAMESTOP IS WHERE ITS AT GAME_DAY
Oct 21 03:54:27 WHAT IS GOING ON WITH MY LUNCH HUNGRY_MAN

Currently, when I run the script, I get the following (the date and time of each line containing the keyword "MAY_DAY"):

1: Oct 15 12:54:01
1: Oct 16 14:16:09
1: Oct 19 05:17:58
1: Oct 15 12:54:01
1: Oct 16 14:16:09
1: Oct 19 05:17:58

The output I need is:

1: Oct 15 12:54:01
1: Oct 16 14:16:09
1: Oct 19 05:17:58

with no duplicates.

2 answers:

Answer 0: (score: 1)

You will have to remember the lines you have already output, using a small array, for example:

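class TestKeyword
  # Assumption: the question never shows how the "1:" prefix is computed,
  # so a fixed placeholder is used to match the desired output.
  counter = 1
  # Dates that have already been printed.
  found = []
  File.foreach("test.txt") do |line|
    if line['MAY_DAY']
      date = line[/\w+ +\d+ +\d+:\d+:\d+/]
      # Print a date only the first time it is seen.
      unless found.include?(date)
        found << date
        puts "#{counter}: #{date}"
      end
    end
  end
end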

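See what I did there? If the date is not in the array yet, we add it and print the date. Otherwise we ignore it.

Edit: if you want to get fancier, you can use a Set instead of an array. Sets are designed for fast lookup of unique elements. If the only question you want to ask is "is this element in the collection?" and you do not care about order, use a Set. On older Ruby versions you also need require 'set' at the top of the script. To do this, just change this line: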

found = []

to this:

found = Set.new
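
For reference, the Set variant can be sketched as a small standalone script. This is only an illustration under a few assumptions: the helper name unique_dates and the in-memory sample array are hypothetical and not part of the original script, and the regular expression is the one from the question.

```ruby
require 'set' # needed on older Ruby versions; harmless on newer ones

# Hypothetical helper (not from the original script): collects the
# timestamp of every line containing the keyword, keeping only the
# first occurrence of each timestamp.
def unique_dates(lines, keyword = "MAY_DAY")
  seen = Set.new
  dates = []
  lines.each do |line|
    next unless line.include?(keyword)
    date = line[/\w+ +\d+ +\d+:\d+:\d+/]
    # Set#add? returns nil when the element is already in the set,
    # so a repeated date is skipped here.
    dates << date if date && seen.add?(date)
  end
  dates
end

# Sample data copied from the test file in the question.
sample = [
  "Oct 15 12:54:01 WHERE IS THE LOVIN MAY_DAY",
  "Oct 16 23:15:44 WHAT THE HECK CAN I DO ABOUT IT HUMP_DAY",
  "Oct 16 14:16:09 I LOVE MY BABY GIRL MAY_DAY",
  "Oct 15 12:54:01 WHERE IS THE LOVIN MAY_DAY"
]
puts unique_dates(sample)
```

To run it against the real file, something like unique_dates(File.foreach("test.txt")) should work, since File.foreach without a block returns an enumerator that yields one line at a time.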

Answer 1: (score: 1)

If the file is not large, this will print the unique matching lines:

puts file.readlines.select { |l| l.include?("MAY_DAY") }.uniq

It does not handle the counter, but that is easy to add.
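
One possible way to add the counter to this approach is sketched below. It assumes the counter is meant to be a running number, which the question does not confirm (its sample output shows "1:" on every line), and the helper name numbered_unique_dates is hypothetical. It also reduces each line to its timestamp before calling uniq, so duplicates are detected by date rather than by the whole line.

```ruby
# Hypothetical helper (not from the original answer): returns numbered,
# de-duplicated timestamps of the lines that contain the keyword.
def numbered_unique_dates(lines, keyword = "MAY_DAY")
  lines
    .select { |line| line.include?(keyword) }       # keep matching lines
    .map { |line| line[/\w+ +\d+ +\d+:\d+:\d+/] }   # extract the timestamp
    .compact                                        # drop lines without one
    .uniq                                           # keep first occurrences
    .each_with_index
    .map { |date, index| "#{index + 1}: #{date}" }  # assumed running counter
end

# Sample data copied from the test file in the question.
sample = [
  "Oct 15 12:54:01 WHERE IS THE LOVIN MAY_DAY",
  "Oct 16 14:16:09 I LOVE MY BABY GIRL MAY_DAY",
  "Oct 20 10:23:33 GAMESTOP IS WHERE ITS AT GAME_DAY",
  "Oct 15 12:54:01 WHERE IS THE LOVIN MAY_DAY"
]
puts numbered_unique_dates(sample)
```

Like readlines, this builds the full list in memory, so it suits small files; for large logs the loop with a Set avoids holding every matching line at once.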