Extracting parts of a string in Ruby

Time: 2011-12-02 02:52:26

Tags: ruby

I was given a text file with the following contents:

Blade Runner (1982) [117 min] 
Full Metal Jacket (1987) [116 min] 
Monty Python and the Holy Grail (1975) [91 min] 
The Godfather (1972) [175 min]

and I have to turn it into this:

Movie name: Blade Runner  
Movie release year: 1982 
Movie length (in mins): 117 

Movie name: Full Metal Jacket  
Movie release year: 1987 
Movie length (in mins): 116 

Movie name: Monty Python and the Holy Grail  
Movie release year: 1975 
Movie length (in mins): 91 

Movie name: The Godfather  
Movie release year: 1972 
Movie length (in mins): 175

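First I iterate over each line, and then I figured I should iterate over each part of the string, but that is where I got stuck. How do I do that? Do I use regular expressions? How do I save the specific strings that a regex matches?

Here is the current shell of the code. It stores the three parts in variables, which are used to initialize a Movie class whose to_s method prints them in the desired format.

I know this is wrong in many ways, which is why I'm asking for help. variable = /regex/ marks a line where the variable should be assigned whatever the regex captures, and when /regex/ marks the branch taken when the regex matches.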

class Movie
    def initialize(name, year, length) # constructor
        @name = name
        @year = year
        @length = length
    end

    def to_s    # returns string representation of object
        # Interpolate into one string: a line that begins with "+" is parsed
        # as a separate statement, so "return a" followed by "+ b" returns only a.
        "Movie name: #{@name}\n" \
        "Movie release year: #{@year}\n" \
        "Movie length (in mins): #{@length}\n\n"
    end
end

$movies = []
File.open("movies.txt").each do |line|
  if matches = /(.*)? \((\d+).*?(\d+)/.match(line)
    $movies << Movie.new(matches[1], matches[2], matches[3])
  end
end


for $movie in $movies do # each iteration yields an element of the array, not an index
    print $movie.to_s
end

Edit:

Fixed version of the code, but the print loop at the end doesn't work.

Edit 2: And now it does. Thanks, PeterPeiGuo!

3 answers:

Answer 0: (score: 2)

m = /(.*)? \((\d+).*?(\d+)/.match("Blade Runner (1982) [117 min]")
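
To make the "how do I save what the regex matched" part concrete, here is a minimal sketch of reading the capture groups back from the MatchData object that match returns. The variable names name, year, and length are illustrative and do not come from the answer.

```ruby
line = "Blade Runner (1982) [117 min]"
m = /(.*)? \((\d+).*?(\d+)/.match(line)

# match returns nil when the line does not fit the pattern, so check first
if m
  name   = m[1]  # first capture group: the title
  year   = m[2]  # second capture group: the release year
  length = m[3]  # third capture group: the running time in minutes
  puts "Movie name: #{name}"
  puts "Movie release year: #{year}"
  puts "Movie length (in mins): #{length}"
end
```

Index 0 of the match holds the whole matched text, which is why the groups start at index 1.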

Answer 1: (score: 1)

You can do it like this:

$movies = []
File.open("movies.txt").each do |line|
  if matches = /^(.*) \((\d+)\) \[(\d+)\smin\]/.match(line)
    $movies << Movie.new(matches[1], matches[2], matches[3])
  end
end
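
A self-contained variant of this loop, offered as a sketch rather than as the answer's own code: it uses named capture groups so the fields are read by name instead of by index, and File.foreach so the file is closed once reading finishes. The pattern and movie_record variables, the Struct standing in for the Movie class, and the temporary file standing in for movies.txt are all assumptions made for illustration.

```ruby
require "tempfile"

# Stand-in for the Movie class from the question (illustrative only)
movie_record = Struct.new(:name, :year, :length)

# Named groups: (?<name>...) can be read back as match[:name]
pattern = /^(?<name>.*) \((?<year>\d+)\) \[(?<length>\d+)\smin\]/

# Temporary file standing in for movies.txt
file = Tempfile.new("movies")
file.write("Blade Runner (1982) [117 min]\nThe Godfather (1972) [175 min]\n")
file.close

movies = []
File.foreach(file.path) do |line|
  match = pattern.match(line)
  next unless match # skip lines that do not fit the format
  movies << movie_record.new(match[:name], match[:year], match[:length])
end
file.unlink # remove the temporary file

movies.each { |m| puts "#{m.name} | #{m.year} | #{m.length}" }
```

Reading fields by name keeps the code valid if the pattern later gains or reorders groups, at the cost of a slightly longer regex.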

Answer 2: (score: 1)

# create string containing list of movies (numerous ways to load this data)
movies = <<-MOV
Blade Runner (1982) [117 min]
Full Metal Jacket (1987) [116 min]
Monty Python and the Holy Grail (1975) [91 min]
The Godfather (1972) [175 min]
MOV

# split movies into lines, then iterate over each line and use a regex
# to extract the relevant data (name, year, runtime)
data = movies.split("\n").map do |s|
  s.scan(/([\w\s]+)\ \((\d+)\)\ \[(\d+)\ min\]/).flatten
end
# => [['Blade Runner', '1982', '117'], ... ]

# iterate over data, output in desired format.
data.each do |name, year, length|
  puts "Movie name: #{name}\nMovie release year: #{year}\nMovie length (in mins): #{length}\n\n"
end
# outputs in format you specified
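
One caveat with the scan-and-flatten approach is that a line that does not fit the format yields an empty array, which then prints as a record with blank fields. The following sketch shows one possible way to drop such lines and to convert the numeric fields to integers. The sample text, the pattern and rows variables, and the choice of match plus compact are assumptions for illustration, not part of the answer.

```ruby
# Sample input with a blank line and a malformed line mixed in
text = "Blade Runner (1982) [117 min]\n\nnot a movie line\nThe Godfather (1972) [175 min]\n"

pattern = /([\w\s]+) \((\d+)\) \[(\d+) min\]/

rows = text.split("\n").map do |line|
  m = pattern.match(line)
  # non-matching lines evaluate to nil here and are removed by compact
  m && [m[1], m[2].to_i, m[3].to_i]
end.compact

rows.each do |name, year, length|
  puts "Movie name: #{name}\nMovie release year: #{year}\nMovie length (in mins): #{length}\n\n"
end
```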