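I am trying to search a paragraph for each word in an array and then output a new array containing only the words that were found.
So far, though, I have not been able to get the output format I want.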
The output I currently get is a vertical list of the printed words.
paragraph = "Japan is a stratovolcanic archipelago of 6,852 islands.
The four largest are Honshu, Hokkaido, Kyushu and Shikoku, which make up about ninety-seven percent of Japan's land area.
The country is divided into 47 prefectures in eight regions."
words_to_find = %w[ Japan archipelago fishing country ]
words_found = []
words_to_find.each do |w|
paragraph.match(/#{w}/) ? words_found << w : nil
end
puts words_found
That prints one word per line:
Japan
archipelago
country
What I want instead is a single array containing the found words.
I don't have much experience matching text in a paragraph, and I'm not sure what I'm doing wrong here. Can anyone offer some guidance?
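Answer 0 (score: 0)
This happens because you use puts to print the array. puts appends "\n" to the end of each element (each "word"), so every word lands on its own line: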
#!/usr/bin/env ruby

def run_me
  paragraph = "Japan is a stratovolcanic archipelago of 6,852 islands.
the four largest are Honshu, Hokkaido, Kyushu and Shikoku, which make up about ninety-seven percent of Japan's land area.
the country is divided into 47 prefectures in eight regions."
  words_to_find = %w[ Japan archipelago fishing country ]
  find_words_from_a_text_file paragraph, words_to_find
end

# Takes the array itself (no splat), so each w below is a single word.
def find_words_from_a_text_file(paragraph, words_to_find)
  words_found = []
  words_to_find.each do |w|
    words_found << w if paragraph.match(/#{w}/)
  end
  # print each element on its own line with each and puts
  words_found.each { |x| puts "with enum and puts : : #{x}" }
  # or use print, which shows the array as-is and does not add a new line
  print "with print :"; print words_found, "\n"
  # or use p, which prints the inspected array followed by a new line
  p words_found
end

run_me

Output:
za:ruby_dir za$ ./fooscript.rb
with enum and puts : : Japan
with enum and puts : : archipelago
with enum and puts : : country
with print :["Japan", "archipelago", "country"]
["Japan", "archipelago", "country"]
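The difference between puts, print and p when each is given an array can also be seen in isolation. The following is only a minimal sketch: the helper capture_stdout and the sample array are assumptions made for illustration and are not part of the answer above.

```ruby
require "stringio"

# Hypothetical helper: runs the block and returns everything it wrote to $stdout.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

words_found = %w[Japan archipelago country]

puts_output  = capture_stdout { puts words_found }   # one element per line
print_output = capture_stdout { print words_found }  # inspected array, no trailing newline
p_output     = capture_stdout { p words_found }      # inspected array plus a newline

puts puts_output
puts print_output
puts p_output
```

If the array form is what you are after, p words_found (or puts words_found.inspect) is probably the shortest route.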
Answer 1 (score: 0)
Here are two approaches. Both are case-insensitive.
Use a regular expression
# Interpolating a Regexp object keeps that object's own flags, which would
# switch off (i) inside the group, so interpolate its source instead.
r = /
    \b                                            # Match a word break
    (?:#{ Regexp.union(words_to_find).source })   # Match any word in words_to_find
    \b                                            # Match a word break
    /xi                                           # Free-spacing regex definition mode (x)
                                                  # and case-indifferent (i)
  #=> /
  #   \b                                          # Match a word break
  #   (?:Japan|archipelago|fishing|country)       # Match any word in words_to_find
  #   \b                                          # Match a word break
  #   /ix                                         # Free-spacing regex definition mode (x)
  #                                               # and case-indifferent (i)
paragraph.scan(r).uniq
  #=> ["Japan", "archipelago", "country"]
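A related variation tests each word separately, which keeps the spelling and order used in words_to_find. This is a sketch under assumptions, not the answerer's method: the helper name words_present is hypothetical, and String#match? requires Ruby 2.4 or later.

```ruby
paragraph = "Japan is a stratovolcanic archipelago of 6,852 islands. " \
            "The four largest are Honshu, Hokkaido, Kyushu and Shikoku, " \
            "which make up about ninety-seven percent of Japan's land area. " \
            "The country is divided into 47 prefectures in eight regions."
words_to_find = %w[Japan archipelago fishing country]

# Hypothetical helper: returns the words that occur in the paragraph as whole words.
def words_present(paragraph, words)
  words.select do |w|
    # Regexp.escape guards against words containing regex metacharacters;
    # \b stops "island" from matching inside "islands"; i ignores case.
    paragraph.match?(/\b#{Regexp.escape(w)}\b/i)
  end
end

p words_present(paragraph, words_to_find)          # ["Japan", "archipelago", "country"]
p words_present(paragraph, %w[land JAPAN island])  # ["land", "JAPAN"]
```

Because a String (not a Regexp) is interpolated here, the i flag on the outer literal applies to the whole pattern.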
Intersect two arrays
words_to_find_hash = words_to_find.each_with_object({}) { |w,h| h[w.downcase] = w }
  #=> {"japan"=>"Japan", "archipelago"=>"archipelago", "fishing"=>"fishing",
  #    "country"=>"country"}
words_to_find_hash.values_at(*paragraph.delete(".;:,?'").
downcase.
split.
uniq & words_to_find_hash.keys)
#=> ["Japan", "archipelago", "country"]
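The same idea can be sketched without listing punctuation characters by hand. This is only an illustration under assumptions: tokenizing with scan(/[a-z]+/) and using a Set for the lookup are substitutions of mine, not part of the answer above, and the pattern only handles ASCII letters.

```ruby
require "set"

paragraph = "Japan is a stratovolcanic archipelago of 6,852 islands. " \
            "The four largest are Honshu, Hokkaido, Kyushu and Shikoku, " \
            "which make up about ninety-seven percent of Japan's land area. " \
            "The country is divided into 47 prefectures in eight regions."
words_to_find = %w[Japan archipelago fishing country]

# Lowercase the text and collect every run of ASCII letters as a token,
# so "Japan's" yields "japan" and "s" and digits and punctuation are dropped.
tokens = paragraph.downcase.scan(/[a-z]+/).to_set

# Keep the words whose lowercase form appears among the tokens.
found = words_to_find.select { |w| tokens.include?(w.downcase) }

p found  # ["Japan", "archipelago", "country"]
```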