I have the following input in a file. My input is:
Index
chapter 1
Introduction to ruby
ruby basics
Installing ruby
executing ruby
chapter 2
Ruby class
Ruby object
Ruby method
Defining method
Calling method
chapter 3
Ruby variable
Local variable
Class variable
Global variable
Instance variable
chapter 4
.
.
.
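chapter 1, chapter 2, 3, 4 and so on are the headings. Each chapter may contain n lines as its sections.
I need to read the details of a specific chapter, including all of its sections. For example, if I grep for chapter 1, the output should be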
chapter 1
Introduction to ruby
ruby basics
Installing ruby
executing ruby
How can I iterate over the following lines and check them? Please help me.
File.open 'test.txt' do |file|
chap_det=file.find { |line| line =~ /chapter 1:/ }
puts chap_det
end
Answer 0 (score: 5)
Assuming you have successfully read the content into the input string:
input = File.read('test.txt')
chapter = ->(n) { /chapter\s+#{n}.*?(?=\R\w)/im }
#⇒ #<Proc:0x00000002b2d7f0@(pry):59 (lambda)>
input[chapter.(2)]
#⇒ "chapter 2\n Ruby class\n (...skipped...) Calling method"
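The regular expression here matches everything starting at chapter N and ending just before a line break (\R matches any kind of line break) that is followed by a word character. This relies on the section lines being indented, so that the next line starting with a word character is the next heading.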
puts input[chapter.(1)]
# Chapter 1
# Introduction to ruby
# ruby basics
# Installing ruby
# executing ruby
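A minimal sketch of how the lookahead behaves, using a hypothetical two-chapter sample. The two-space indentation of the section lines is an assumption, and so is the \z variant at the end, which is not part of the answer. It shows that the pattern stops just before the next heading, and that the last chapter has no following heading to stop at, so the pattern as written returns nil there:

```ruby
# Hypothetical sample; section lines are assumed to be indented by two spaces.
sample = "Index\n" \
         "chapter 1\n" \
         "  Introduction to ruby\n" \
         "  ruby basics\n" \
         "chapter 2\n" \
         "  Ruby class\n" \
         "  Ruby object\n"

# The pattern from the answer: lazily consume text until a line break
# that is followed by a word character (the next unindented line).
chapter = ->(n) { /chapter\s+#{n}.*?(?=\R\w)/im }

first = sample[chapter.(1)]
# first is "chapter 1\n  Introduction to ruby\n  ruby basics"

# Nothing unindented follows the last chapter, so the lookahead never succeeds.
last = sample[chapter.(2)]
# last is nil

# Assumed variant (not from the answer): also accept the end of the input.
chapter_or_end = ->(n) { /chapter\s+#{n}.*?(?=\R\w|\z)/im }
last_fixed = sample[chapter_or_end.(2)]
# last_fixed is "chapter 2\n  Ruby class\n  Ruby object\n"
```

If the last chapter matters, the \z alternative is one possible way to handle it; note that the match then includes the trailing line break.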
NB! The regular expression proposed by Wiktor Stribiżew in the comments below is slightly faster, because it does not involve lazy dot matching:
chapter = ->(n) { /chapter\s+#{n}\b.*(?:\R\B.*)*/i }
Proof:
input = %|Index
Chapter 1
Introduction to ruby
ruby basics
Installing ruby
executing ruby
chapter 2
Ruby class
Ruby object
Ruby method
Defining method
Calling method
chapter 3
Ruby variable
Local variable
Class variable
Global variable
Instance variable
Chapter 4
Introduction to ruby
ruby basics
Installing ruby
executing ruby
chapter 5
Ruby class
Ruby object
Ruby method
Defining method
Calling method
chapter 6
Ruby variable
Local variable
Class variable
Global variable
Instance variable
|
ch1 = ->(n) { /chapter\s+#{n}.*?(?=\R\w)/im }
ch2 = ->(n) { /chapter\s+#{n}\b.*(?:\R\B.*)*/i }
require 'benchmark'
n = 500000
Benchmark.bm(7) do |x|
x.report("1:") { n.times do input[ch1.(4)] end }
x.report("2:") { n.times do input[ch2.(4)] end }
end
#⇒ user system total real
# 1: 6.460000 0.000000 6.460000 ( 6.460074)
# 2: 6.010000 0.000000 6.010000 ( 6.010000)
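To see why the second pattern needs no lazy matching, here is a small sketch on a hypothetical sample (the two-space indentation is again an assumption). The pattern consumes the heading line, then keeps taking whole lines as long as the line break is followed by a non-word boundary (\B), which holds when the next line starts with whitespace:

```ruby
# Hypothetical sample; section lines are assumed to be indented by two spaces.
sample = "Index\n" \
         "chapter 1\n" \
         "  Introduction to ruby\n" \
         "  ruby basics\n" \
         "chapter 2\n" \
         "  Ruby class\n" \
         "  Ruby object\n"

# No /m flag here: each ".*" stays on one line, and the group
# (?:\R\B.*)* adds one indented line per iteration.
chapter = ->(n) { /chapter\s+#{n}\b.*(?:\R\B.*)*/i }

first = sample[chapter.(1)]
# first is "chapter 1\n  Introduction to ruby\n  ruby basics"

# The last chapter matches as well, because no following heading is required.
second = sample[chapter.(2)]
section_lines = second.lines.map(&:strip)
# section_lines is ["chapter 2", "Ruby class", "Ruby object"]
```

Because the number is followed by \b, asking for chapter 1 cannot match a heading such as chapter 10.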
Answer 1 (score: 1)
Out of curiosity: a solution using the flip-flop operator:
▶ N = 2
▶ File.readlines('text.txt').select do |line|
▷ true if line[/chapter #{N}/i]..line[/chapter #{N+1}/i]
▷ end[0...-1].join $/
#⇒ "chapter 2\n (... skipped out ...) Calling method"
This is roughly 3 times slower than the regex solution.
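For readers unfamiliar with the flip-flop, a minimal sketch on an in-memory array. The data and the method name chapter_lines are made up for illustration. The range inside the condition switches on at the line matching the first pattern and switches off after the line matching the second, so the next heading is part of the selection and has to be dropped. The sketch drops it only when it is actually there, an assumed tweak so that the last chapter does not lose its final line:

```ruby
# Hypothetical helper, not from the answer.
def chapter_lines(lines, n)
  start_re = /chapter #{n}\b/i
  stop_re  = /chapter #{n + 1}\b/i
  picked = lines.select do |line|
    # Flip-flop: true from the first start match through the first stop match.
    true if line[start_re]..line[stop_re]
  end
  # The stop line (the next heading) was selected too; remove it if present.
  picked.pop if picked.last && picked.last[stop_re]
  picked
end

lines = [
  "Index\n",
  "chapter 1\n", "  Introduction to ruby\n",
  "chapter 2\n", "  Ruby class\n", "  Ruby object\n",
  "chapter 3\n", "  Ruby variable\n"
]

second = chapter_lines(lines, 2)
# second is ["chapter 2\n", "  Ruby class\n", "  Ruby object\n"]
third = chapter_lines(lines, 3)
# third is ["chapter 3\n", "  Ruby variable\n"]
```

Since the elements still carry their line breaks, a plain join is enough to turn the result back into a string.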
Answer 2 (score: 0)
You can also use the following code:
chapter_lines = []
start = false
chapter_number = 1
File.foreach("test.txt") do |line| # foreach closes the file when the block finishes
start = true if line["chapter #{chapter_number}"]
start = false if line["chapter #{chapter_number+1}"]
chapter_lines << line.strip if start
end
puts chapter_lines.join("\n")
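Edit: note that this assumes every reference to a chapter is written "chapter" and not "Chapter". The question has "Chapter" once and "chapter" elsewhere; the only difference is the lowercase versus capital c.
Hope that helps :)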
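Building on the case caveat above, here is a hedged sketch of a case-insensitive variant. The helper name read_chapter and the temporary file are assumptions for illustration, and it assumes that headings start at the beginning of a line. It compares the chapter number numerically, so chapter 1 is not confused with chapter 10, and it stops reading at the next heading:

```ruby
require 'tempfile'

# Hypothetical helper: returns the stripped lines of one chapter.
def read_chapter(path, number)
  heading = /\Achapter\s+(\d+)\b/i
  collected = []
  inside = false
  File.foreach(path) do |line|
    if (m = heading.match(line))
      break if inside                 # reached the next heading: done
      inside = (m[1].to_i == number)  # start collecting at the requested heading
    end
    collected << line.strip if inside
  end
  collected
end

# Demo on a temporary file with mixed-case headings.
chapter_one = nil
Tempfile.create('chapters') do |file|
  file.write("Index\nChapter 1\n  Introduction to ruby\n  ruby basics\nchapter 2\n  Ruby class\n")
  file.flush
  chapter_one = read_chapter(file.path, 1)
end
puts chapter_one
```

Breaking out of the loop means the rest of the file is never read, which may matter for large inputs.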
Answer 3 (score: 0)
This is a common problem, and Ruby's slice_before or slice_after methods are very useful for it. Using slice_before:
doc = <<EOT
Index
chapter 1
Introduction to ruby
ruby basics
Installing ruby
executing ruby
chapter 2
Ruby class
Ruby object
Ruby method
Defining method
Calling method
chapter 3
Ruby variable
Local variable
Class variable
Global variable
Instance variable
EOT
chapters = doc.lines.slice_before(/^chapter/).to_a
# => [["Index\n"], ["chapter 1\n", " Introduction to ruby\n", " ruby basics\n", " Installing ruby\n", " executing ruby\n"], ["chapter 2\n", " Ruby class\n", " Ruby object\n", " Ruby method\n", " Defining method\n", " Calling method\n"], ["chapter 3\n", " Ruby variable\n", " Local variable\n", " Class variable\n", " Global variable\n", " Instance variable\n"]]
chapters.shift
chapters[0] # => ["chapter 1\n", " Introduction to ruby\n", " ruby basics\n", " Installing ruby\n", " executing ruby\n"]
chapters.shift is used to drop the first element of the resulting array (the Index line), leaving an array of chapters indexed in order.
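From there it is easy to restore the whole "chapter" content with join if needed, but since the lines are already array elements, you may prefer to keep them as they are: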
chapters[0].join # => "chapter 1\n Introduction to ruby\n ruby basics\n Installing ruby\n executing ruby\n"
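Since you are reading from a file, as long as the file fits comfortably in memory, you can use File.readlines('file_to_read') to read it straight into an array of lines, and then split that array in the same way.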
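Putting the pieces of this answer together, a minimal sketch that reads a file and indexes the chapters by number. The helper name chapters_by_number, the temporary file, and the case-insensitive heading pattern are assumptions, not part of the answer:

```ruby
require 'tempfile'

# Hypothetical helper: maps each chapter number to its lines (line breaks removed).
def chapters_by_number(path)
  heading = /\Achapter\s+(\d+)/i
  File.readlines(path)
      .slice_before(heading)                      # start a new group at every heading
      .select { |group| group.first =~ heading }  # drop anything before the first heading
      .map { |group| [group.first[heading, 1].to_i, group.map(&:chomp)] }
      .to_h
end

# Demo on a temporary file.
chapters = nil
Tempfile.create('chapters') do |file|
  file.write("Index\nchapter 1\n  Introduction to ruby\nchapter 2\n  Ruby class\n  Ruby object\n")
  file.flush
  chapters = chapters_by_number(file.path)
end

puts chapters[2]
```

Keying the result by chapter number avoids the shift step and the off-by-one between array index and chapter number.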