Ruby Array: Strings to Integers

Time: 2016-10-04 07:54:43

Tags: arrays ruby regex

I'm new to Ruby. I have a series of arrays, each holding two strings:

["[[\"Wayfair \", \"57\"]]", "[[\"Move24 \", \"26\"]]",
  "[[\"GetYourGuide \", \"25\"]]", "[[\"Visual Meta \", \"22\"]]",
  "[[\"FinLeap \", \"20\"]]", "[[\"Movinga \", \"20\"]]",
  "[[\"DCMN \", \"19\"]]", ...

I'm trying to convert the number in each array to an integer, but the result is not what I expect:

companies = companies.map do |company|
  c = company[0].scan(/(.+)\((\d+)\)/).inspect
  [c[0], c[1].to_i]
end

This produces:

[["[", 0], ["[", 0], ["[", 0], ["[", 0], ["[", 0], ["[", 0],
  ["[", 0], ["[", 0], ["[", 0], ["[", 0], ["[", 0]]

I expected:

 ["Wayfair", 57],  ["Move24", 26], ["GetYourGuide", 25], ...

Can anyone help?

Full code:

require 'net/http'
require 'uri'

uri = URI('http://berlinstartupjobs.com/') #URI takes just one url
req = Net::HTTP::Get.new(uri) #get in URI
req['User-Agent'] = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36   (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36' #use this header


res = Net::HTTP.start(uri.hostname, uri.port) {|http| http.request(req)} # URI documentation

puts res.code #status code

puts res.body

puts res.body.scan('<a href="http://berlinstartupjobs.com/companies/') #scan in the body of the document files that match a href=...

puts res.body.scan(/<a href="http:\/\/berlinstartupjobs\.com\/companies\/[^\s]+ class="tag-link">(.*)<\/a>/) #scan

companies = res.body.scan(/<a href="http:\/\/berlinstartupjobs\.com\/companies\/[^\s]+ class="tag-link">(.*)<\/a>/)


companies = companies.map do |company|
  c = company[0].scan(/(.+)\((\d+)\)/).inspect
  [c[0], c[1].to_i]
end # do ... end = { }

  puts companies.inspect

3 answers:

Answer 0 (score: 1)

You can use Enumerable#map and parse each element with JSON.parse:

require 'json'

companies.map { |elem| key, val = JSON.parse(elem).flatten; [key.strip, val.to_i] }

You could also use eval instead of JSON.parse, but using eval is considered bad practice.
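
One way to see the difference, as a rough sketch: JSON.parse only accepts data and raises JSON::ParserError on anything else, whereas eval would execute whatever Ruby code the string happens to contain. The helper name parse_company is hypothetical, and the sample strings are modeled on the data in the question.

```ruby
require 'json'

# Hypothetical helper: returns [name, count], or nil when the
# string is not valid JSON (instead of executing it, as eval would).
def parse_company(str)
  name, count = JSON.parse(str).flatten
  [name.strip, count.to_i]
rescue JSON::ParserError
  nil
end

good = parse_company("[[\"Wayfair \", \"57\"]]")  # well-formed JSON
bad  = parse_company("system('echo oops')")       # Ruby code, not JSON
```

Here good holds a name/count pair and bad is nil, because the second string is rejected by the parser rather than run.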

Answer 1 (score: 1)

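require 'json'

arr = ["[[\"Wayfair \", \"57\"]]", "[[\"Move24 \", \"26\"]]"]
# Each parsed element is a one-row array such as [["Wayfair ", "57"]]; flat_map merges the rows into one list.
result = arr.flat_map { |e| JSON.parse(e).map { |name, value| [name.strip, value.to_i] } }

OUTPUT:
[["Wayfair", 57], ["Move24", 26]]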

Answer 2 (score: 1)

Your code is basically fine. Just drop the .inspect at the end: it returns a string, not an array.

# this is what you get from the scraping.
companies = [["Wayfair (57)"], ["Move24 (26)"], ["GetYourGuide (25)"]]

companies = companies.flatten.map do |company|
  c = company.scan(/(.+)\((\d+)\)/).flatten
  [c[0], c[1].to_i]
end

p companies
# >> [["Wayfair ", 57], ["Move24 ", 26], ["GetYourGuide ", 25], ...]
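
A minimal sketch of why the original attempt printed ["[", 0] for every company, assuming the scraped entries look like "Wayfair (57)" as in the sample above. The variable names are illustrative only.

```ruby
company = "Wayfair (57)"

# scan returns an array of capture-group arrays.
matches = company.scan(/(.+)\((\d+)\)/)

# inspect converts that array into its String representation,
# so indexing now picks out single characters, not captures.
as_string   = matches.inspect
first_char  = as_string[0]                    # the opening bracket
second_char = as_string[1]                    # the nested opening bracket
broken      = [first_char, second_char.to_i]  # to_i yields 0 for non-numeric text

# Without inspect, flatten exposes the two captures directly.
name, count = matches.flatten
fixed = [name.strip, count.to_i]
```

Adding strip also removes the trailing space that the (.+) group captures before the opening parenthesis.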