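Creating Rails records with Ruby CSV leaves string fields unqueryable

Time: 2011-10-13 00:35:30

Tags: ruby ruby-on-rails-3.1 fastercsv

I'm trying to load seed data into my Rails application from a CSV file. I originally installed the fastercsv gem, but found that it has been deprecated in favor of the updated CSV library as of Ruby 1.9. So I switched to CSV after getting a very helpful error telling me to switch.

Now, however, I'm seeing the strangest thing: when I load my data, everything appears to load fine, but I can't seem to query the string fields. The string fields are populated with what look like the correct strings, yet I can't match on them. I can query on any numeric field and get results back, but not on a string field. I tried changing the quote delimiter, to no avail. I even removed all the quotes from my CSV file and still could not query the string fields. Below is my code, along with some sample queries and their results from the Rails console.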

# seeds.rb
# ================

require 'csv'

directory = "db/init_data/"

file_name = "players.seed"
path_to_file = directory + file_name
puts 'Loading Player records'
# Pre-load All Player records
n = 0
CSV.foreach(path_to_file) do |row|
  Player.create! :first_name => row[1], :last_name => row[2], :position_id => row[5], :weight => row[6], :height => row[7], :year => row[8], :home_state => row[9], :home_town => row[10], :home_country => row[11], :high_school_id => row[12], :name => row[13]
  n += 1
end

Here are the first two records from my seed file.

# players.seed
"1","Allerik","Freeman","2011-10-11 22:21:21.230247","2011-10-11 22:21:21.230247","2","210","76","2013","NC","Charlotte","USA","1","Allerik Freeman"
"2","Kasey","Hill","2011-10-11 22:21:21.262409","2011-10-11 22:21:21.262409","1","170","73","2013","FL","Eustis","USA","2","Kasey Hill"

Here is what I get in the Rails console. If I query on a numeric field such as year, it works fine.

ruby-1.9.2-p290 :002 > Player.find_all_by_year(2013)
  Player Load (0.7ms)  SELECT "players".* FROM "players" WHERE "players"."year" = 2013
 => [#<Player id: 1, first_name: "Allerik", last_name: "Freeman", created_at: "2011-10-12 20:52:16", updated_at: "2011-10-12 20:52:16", position_id: 2, weight: 210, height: 76, year: 2013, home_state: "NC", home_town: "Charlotte", home_country: "USA", high_school_id: 1, name: "Allerik Freeman">, #<Player id: 2, first_name: "Kasey", last_name: "Hill", created_at: "2011-10-12 20:52:16", updated_at: "2011-10-12 20:52:16", position_id: 1, weight: 170, height: 72, year: 2013, home_state: "FL", home_town: "Eustis", home_country: "USA", high_school_id: 2, name: "Kasey Hill">]

But if I try to query by last name, I get nothing back, even though the previous query shows that the last name is there.

ruby-1.9.2-p290 :004 > Player.find_all_by_last_name("Freeman")
  Player Load (0.3ms)  SELECT "players".* FROM "players" WHERE "players"."last_name" = 'Freeman'
 => [] 

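The only way I could get it to work was to wrap each string in an extra set of escaped double quotes using #{} interpolation, which put all of my string values into the database with literal quotes around them, and then to use delete to strip the quotes back out.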

  n=0
  CSV.foreach(path_to_file) do |row|
    Player.create! :first_name => "\"#{row[1]}\"", :last_name => "\"#{row[2]}\"", :position_id => row[5], :weight => row[6], :height => row[7], :year => row[8], :home_state => "\"#{row[9]}\"", :home_town => "\"#{row[10]}\"", :home_country => "\"#{row[11]}\"", :high_school_id => row[12], :name => "\"#{row[13]}\""      
    n=n+1
  end
  puts "There\'s too many playas to hate, we just loaded #{n} of \'em"

  @players = Player.all
  @players.each do |player|
    fname = player.first_name
    player.first_name = fname.delete("\"")
    lname = player.last_name
    player.last_name = lname.delete("\"")
    pcity = player.home_town
    player.home_town = pcity.delete("\"")
    pst = player.home_state
    player.home_state = pst.delete("\"")
    pcountry = player.home_country
    player.home_country = pcountry.delete("\"")
    pname = player.name
    player.name = pname.delete("\"")
    player.save!
  end  

After that I can query the string data.

ruby-1.9.2-p290 :005 > Player.find_all_by_last_name("Freeman")
  Player Load (0.6ms)  SELECT "players".* FROM "players" WHERE "players"."last_name" = 'Freeman'
 => [#<Player id: 1, first_name: "Allerik", last_name: "Freeman", created_at: "2011-10-12 20:52:16", updated_at: "2011-10-12 20:52:16", position_id: 2, weight: 210, height: 76, year: 2013, home_state: "NC", home_town: "Charlotte", home_country: "USA", high_school_id: 1, name: "Allerik Freeman">, #<Player id: 59, first_name: "Austin", last_name: "Freeman", created_at: "2011-10-12 20:55:16", updated_at: "2011-10-12 20:55:16", position_id: 2, weight: 210, height: 76, year: 2007, home_state: "MD", home_town: "Hyattsville", home_country: "USA", high_school_id: nil, name: "Austin Freeman">] 

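Obviously this is not a preferred approach, since it doubles my load time, but I'm honestly at my wit's end.

Any help is greatly appreciated.

As requested, I've added my schema.rb.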

# schema.rb
# ===================
# encoding: UTF-8
# ...

ActiveRecord::Schema.define(:version => 20111007214728) do

#...

  create_table "players", :force => true do |t|
    t.string   "first_name"
    t.string   "last_name"
    t.datetime "created_at"
    t.datetime "updated_at"
    t.integer  "position_id"
    t.integer  "weight"
    t.integer  "height"
    t.integer  "year"
    t.string   "home_state"
    t.string   "home_town"
    t.string   "home_country"
    t.integer  "high_school_id"
    t.string   "name"
  end

# ...

end

As requested, here are screenshots of the database as seen in my SQLite database browser.

[Screenshot: View of Player Table: Looks normal right?]

[Screenshot: No Rows Returned when querying a string field]

It looks like there is a similar issue here in the ruby forums. It may be related to encoding, but I need to do more research on encoding to figure it out.
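If encoding does turn out to be the cause (the thread does not confirm this), one way to test the idea in a single pass would be to normalize each string field to UTF-8 before handing it to `Player.create!`. This is only a sketch: `utf8_field` is a hypothetical helper, and the sample line stands in for a row of players.seed.

```ruby
require 'csv'

# Hypothetical helper: return a UTF-8 tagged copy of a CSV field.
def utf8_field(value)
  return value if value.nil?
  copy = value.dup
  # Binary-tagged bytes are relabelled as UTF-8; anything else is transcoded.
  if copy.encoding == Encoding::BINARY
    copy.force_encoding('UTF-8')
  else
    copy.encode('UTF-8')
  end
end

# Sample line standing in for one row of the seed file.
row = CSV.parse_line(%("1","Allerik","Freeman"))

# These are the attributes that would be passed to Player.create!
attrs = { :first_name => utf8_field(row[1]), :last_name => utf8_field(row[2]) }
puts attrs.inspect
puts attrs.values.map { |v| v.encoding.name }.inspect
```

If this makes the string queries work, it would replace the second loop that strips quotes from every record.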

3 Answers:

Answer 0 (score: 2)

Try adding # encoding: UTF-8 at the very top of players.seed

# encoding: UTF-8
# players.seed
...
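One thing worth checking before trying this: by default Ruby's CSV library does not treat lines starting with # as comments, so a comment line in a data file is returned as an ordinary row. The sketch below uses made-up inline data rather than the real players.seed. The `skip_lines` option exists in the CSV library shipped with current Ruby versions, and I am assuming it is available in the Ruby being used.

```ruby
require 'csv'

# Inline stand-in for a seed file that starts with a comment line.
data = <<~SEED
  # encoding: UTF-8
  "1","Allerik","Freeman"
SEED

# Default parsing: the comment line comes back as a one-field data row.
rows = CSV.parse(data)
puts rows.first.inspect

# skip_lines drops every line that matches the pattern.
rows_skipped = CSV.parse(data, skip_lines: /\A#/)
puts rows_skipped.first.inspect
```

With the seeds.rb loop from the question, an unskipped comment row would reach Player.create! with most of its fields nil.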

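Answer 1 (score: 0)

Please check the following:

  • The encoding of the strings in the database; it should probably be UTF-8.

    How did you create the database? In MySQL you would use something like this:

    CREATE DATABASE DatabaseName DEFAULT CHARACTER SET utf8;

  • The encoding of the strings you get from the CSV file when parsing/reading it.

See: http://www.ruby-doc.org/stdlib-1.9.2/libdoc/csv/rdoc/CSV.html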
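As a minimal sketch of the second check, `CSV.foreach` accepts an `:encoding` option naming the encoding of the file being read. The temp file and its contents are stand-ins for db/init_data/players.seed, and the assumption that the seed file is UTF-8 is mine, not something the question confirms.

```ruby
require 'csv'
require 'tempfile'

# Stand-in for db/init_data/players.seed (made-up sample data).
seed = Tempfile.new(['players', '.seed'])
seed.write(%("1","Allerik","Freeman"\n"2","Kasey","Hill"\n))
seed.close

last_names = []
# Tell CSV explicitly which encoding the file uses (assumed UTF-8 here).
CSV.foreach(seed.path, encoding: 'UTF-8') do |row|
  last_names << row[2]
end

puts last_names.inspect
puts last_names.map { |s| s.encoding.name }.inspect

seed.unlink
```

Printing the encoding name of each field shows whether the strings handed to the model are tagged the way the database expects.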

You could also try reading the CSV file directly, to check the encoding of the strings as they are read from the file.
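A rough sketch of that direct read, again using a temp file as a stand-in for the real seed file and assuming UTF-8 content: the same bytes carry a different encoding tag depending on how they are read.

```ruby
require 'tempfile'

# Stand-in for db/init_data/players.seed (made-up sample data).
seed = Tempfile.new(['players', '.seed'])
seed.write(%("1","Allerik","Freeman"\n))
seed.close

# Read as text, declaring the external encoding (assumed UTF-8).
text_line = File.open(seed.path, 'r:UTF-8') { |f| f.gets }
# Read the same bytes with no text encoding applied.
raw_bytes = File.binread(seed.path)

puts "text read:   #{text_line.encoding} (valid: #{text_line.valid_encoding?})"
puts "binary read: #{raw_bytes.encoding}"
# Identical bytes, different encoding tags.
puts "same bytes?  #{text_line.bytes == raw_bytes.bytes}"

seed.unlink
```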


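Edit:

Some sources say that SQLite supports only ISO-8859-1 encoding, and UTF-8 only if that was specified at compile time, which could be the problem. Which version of SQLite are you using? http://refdb.sourceforge.net/manual/ch08s09.html

On the other hand, this source says that SQLite 3.x uses UTF-8: http://www.sqlite.org/version3.html

Answer 2 (score: 0)

Try adding "# coding: utf-8" as the first line of seeds.rb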

# coding: utf-8
# seeds.rb
# ================
...
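For context, the magic comment sets the source encoding of the file it appears in, which is the encoding given to string literals written in that file. The short sketch below only illustrates that. Whether it changes the strings that CSV returns from the seed file is not something this thread confirms.

```ruby
# coding: utf-8

literal = "Freeman"

# __ENCODING__ reports the source encoding of the current file.
puts __ENCODING__
# String literals in the file are tagged with that same encoding.
puts literal.encoding
puts literal.encoding == __ENCODING__
```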