Ruby - Remove only specific duplicate lines from a file

Time: 2012-10-19 18:16:48

Tags: ruby

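I want to remove duplicate lines from a file, but only the duplicates that match a specific regular expression; all other duplicates should stay in the file. This is what I have right now: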

unique_lines = File.readlines("Ops.Web.csproj").uniq do |line|    
  line[/^.*\sInclude=\".*\"\s\/\>$/]
end

File.open("Ops.Web.csproj", "w+") do |file|
  unique_lines.each do |line|
    file.puts line
  end
end

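This correctly de-duplicates the matching lines, but it only writes the lines that match the regex back to the file. I need every other line in the file to be left unchanged. I know I'm missing something small here. Ideas?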
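A note on why the snippet above loses lines: `Array#uniq` with a block uses the block's return value as the uniqueness key, and `String#[]` with a regex returns `nil` when nothing matches. Every non-matching line therefore shares the key `nil`, so they are treated as duplicates of one another. The sketch below illustrates this with made-up sample data (the `sample` array and the simplified pattern are assumptions for illustration, not the real project file):

```ruby
# Hypothetical sample data, not taken from the original .csproj file.
sample = ["<a>\n", "<X Include=\"foo\" />\n", "<b>\n", "<X Include=\"foo\" />\n"]

# The block's return value is the uniqueness key. For "<a>\n" and "<b>\n"
# the regex does not match, so both produce the key nil.
collapsed = sample.uniq { |line| line[/Include=".*"/] }

# "<b>\n" is gone because it shares the nil key with "<a>\n".
p collapsed
```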

1 answer:

Answer 0: (score: 4)

Try this:

lines = File.readlines("input.txt")
out = File.open("output.txt", "w+")
seen = {}

lines.each do |line|
  # check if we want this de-duplicated
  if line =~ /Include/
    if !seen[line]
      out.puts line
      seen[line] = true
    end
  else
    out.puts line
  end
end

out.close

Demo:

➜  12980122  cat input.txt
a
b
c
Include a
Include b
Include a
Include a
d
e
Include b
f
➜  12980122  ruby exec.rb
➜  12980122  cat output.txt
a
b
c
Include a
Include b
d
e
f
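
The same idea can be wrapped in a small reusable method. This is only a sketch: the name `dedupe_matching` and its parameters are hypothetical, and it assumes Ruby 2.4 or later for `String#match?`. It uses a `Set` instead of a hash to remember which matching lines have already been seen.

```ruby
require "set"

# Hypothetical helper: drops repeated lines that match `pattern`, keeping
# the first occurrence. Lines that do not match always pass through.
def dedupe_matching(lines, pattern)
  seen = Set.new
  lines.select do |line|
    # Set#add? returns nil when the line is already in the set,
    # which makes select skip the repeated matching line.
    !line.match?(pattern) || seen.add?(line)
  end
end

# Same content as input.txt in the demo above.
input = ["a", "b", "c", "Include a", "Include b", "Include a",
         "Include a", "d", "e", "Include b", "f"]
result = dedupe_matching(input, /Include/)
puts result
```

To rewrite a file in place, one option is to read it fully with `File.readlines` first and only then open it for writing, so the input is not truncated before it has been read.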