I'm trying to parse an open-data XML file from the Internet into my Rails database. Here is the code that is supposed to parse it:
require 'rake'
require 'open-uri'
namespace :db do
  task :xml_parser => :environment do
    doc = Nokogiri::XML(open("https://dl.dropboxusercontent.com/u/21695507/openplaques/gb_20151004.xml"))
    doc.css('plaque').each do |node|
      children = node.children
      Plaque.create(
        :title => children.css('title').inner_text,
        :subject => children.css('subjects').inner_text,
        :colour => children.css('colour').inner_text,
        :inscription => children.css('inscription raw').inner_text,
        :latitude => children.css('geo')["latitude"].text,
        :longitude => children.css('geo')["longitude"].text,
        :address => children.css('address').inner_text,
        :organisation => children.css('organisation').inner_text,
        :date_erected => children.css('date_erected').inner_text
      )
    end
  end
end
Here is the schema:
create_table "plaques", force: :cascade do |t|
  t.string "title"
  t.string "subject"
  t.string "colour"
  t.text "inscription"
  t.string "latitude"
  t.string "longitude"
  t.text "address"
  t.text "organisation"
  t.string "date_erected"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
end
When I run rake db:xml_parser, I get the following error:
TypeError: no implicit conversion of String into Integer
Here is a sample from the XML file I'm trying to parse:
<plaque uri="http://openplaques.org/plaques/4856" machine_tag="openplaques:id=4856" created_at="2010-11-26T13:58:23+00:00" updated_at="2011-06-28T17:00:01+01:00">
  <title>Arthur Linton blue plaque</title>
  <subjects>Arthur Linton</subjects>
  <colour>blue</colour>
  <inscription>
    <raw>
      World Champion Cyclist 1895 lived here Arthur Linton 1872-1896
    </raw>
    <linked>
      World Champion Cyclist 1895 lived here <a href="/people/2934">Arthur Linton</a> 1872-1896
    </linked>
  </inscription>
  <geo reference_system="WGS84" latitude="51.7005" longitude="-3.4251" is_accurate="true" />
  <location>
    <address>Sheppard's Pharmacy, 218 Cardiff Road</address>
    <locality uri="http://0.0.0.0:3000/places/gb/areas/aberaman/plaques">Aberaman</locality>
    <country uri="http://0.0.0.0:3000/places/gb">United Kingdom</country>
  </location>
  <organisation uri="http://0.0.0.0:3000/organisations/rhondda_cynon_taf_council">Rhondda Cynon Taf Council</organisation>
  <date_erected>2009-10-26</date_erected>
  <person uri="http://0.0.0.0:3000/people/2934">Arthur Linton</person>
</plaque>
Answer 0 (score: 0)
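I don't think the error is in the schema or in the Plaque.create(...) call. I think it is in how you pull the data out of Nokogiri. some_node.css("some-selector") returns a set of all the nodes that match the selector (a Nokogiri::XML::NodeSet). That set may well contain just one node (or none), and .inner_text is defined on the set itself, so those calls work.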
I believe your problem is in the two lines that fetch the latitude and longitude:
:latitude => children.css('geo')["latitude"].text,
:longitude => children.css('geo')["longitude"].text,
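children.css('geo') returns a set of nodes; in this case it behaves like a one-element array, [geo]. Calling ["latitude"] on it is like asking an array for its "latitude" element, which makes no sense. Concretely: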
a = ["a", "b", "c", "d"]
a[1] # => "b"
a["longitude"] # => what?!?, or TypeError: no implicit conversion of String into Integer
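To get the lat and long values, I would first take the first element from the css("geo") result, then call attributes on it to get the hash of attributes. You can then look up "latitude" and "longitude" by string key, and finally call .value to get the text value. In full: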
:latitude => children.css('geo').first.attributes["latitude"].value,
:longitude => children.css('geo').first.attributes["longitude"].value,
Answer 1 (score: 0)
There is a simpler solution, and it works perfectly!
require 'rake'
require 'open-uri'
namespace :db do
  task :xml_parser => :environment do
    # URI.open (Ruby 2.5+) is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs.
    doc = Nokogiri::XML(URI.open("https://dl.dropboxusercontent.com/u/21695507/openplaques/gb_20151004.xml"))
    doc.css('plaque').each do |node|
      # XPath expressions are relative to the current <plaque> node.
      title = node.xpath("title").text
      subject = node.xpath("subjects").text
      colour = node.xpath("colour").text
      inscription = node.xpath("inscription/raw").text.strip
      # latitude and longitude are attributes of <geo>, not child elements.
      geo = node.at_xpath("geo")
      latitude = geo && geo["latitude"]
      longitude = geo && geo["longitude"]
      address = node.xpath("location/address").text
      organisation = node.xpath("organisation").text
      date_erected = node.xpath("date_erected").text
      Plaque.create(:title => title, :subject => subject, :colour => colour, :inscription => inscription, :latitude => latitude, :longitude => longitude, :address => address, :organisation => organisation, :date_erected => date_erected)
    end
  end
end