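I want to remove duplicate lines from a file, but only the duplicates that match a specific regular expression; all other duplicates should stay in the file. This is what I have so far: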
unique_lines = File.readlines("Ops.Web.csproj").uniq do |line|
  line[/^.*\sInclude=\".*\"\s\/\>$/]
end
File.open("Ops.Web.csproj", "w+") do |file|
  unique_lines.each do |line|
    file.puts line
  end
end
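This de-duplicates the lines correctly, but it only writes the lines that match the regex back to the file. I need every other line in the file to be left unchanged. I know I'm missing something small here. Any ideas?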
Answer 0: (score: 4)
Try this:
lines = File.readlines("input.txt")
out = File.open("output.txt", "w+")
seen = {}
lines.each do |line|
  # check if we want this de-duplicated
  if line =~ /Include/
    if !seen[line]
      out.puts line
      seen[line] = true
    end
  else
    out.puts line
  end
end
out.close
Demo:
➜ 12980122 cat input.txt
a
b
c
Include a
Include b
Include a
Include a
d
e
Include b
f
➜ 12980122 ruby exec.rb
➜ 12980122 cat output.txt
a
b
c
Include a
Include b
d
e
f
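
As for why the original attempt drops lines: `line[regex]` returns nil for every line that does not match, so `uniq` sees all non-matching lines as having the same key and keeps only the first of them. The loop above can also be written more compactly as a sketch like the following. The helper name `dedupe_matching` and the sample lines are made up for illustration; it assumes Ruby 2.4 or later for `String#match?`.

```ruby
require "set"

# Hypothetical helper (not from the question or the answer above):
# keeps every line that does not match `pattern`, and only the first
# occurrence of each line that does match.
def dedupe_matching(lines, pattern)
  seen = Set.new
  lines.select do |line|
    # Set#add? returns nil when the line is already in the set,
    # so a repeated matching line is filtered out.
    !line.match?(pattern) || seen.add?(line)
  end
end

# Made-up sample input; the non-matching "a" lines are both kept.
sample = ["a\n", "Include a\n", "Include b\n", "Include a\n", "a\n"]
puts dedupe_matching(sample, /Include/)
```

To apply it to the files from the answer, something like `File.write("output.txt", dedupe_matching(File.readlines("input.txt"), /Include/).join)` should work.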