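I'm using Rails 3.1 with PostgreSQL 8.4. Let's assume I want or need to use GUID primary keys. One potential drawback is index fragmentation. In MS SQL, a recommended solution is to use special sequential GUIDs. One approach to sequential GUIDs is the COMBination GUID (COMB), which substitutes a 6-byte timestamp for the MAC address portion at the end of the GUID. This has some mainstream adoption: COMBs are available natively in NHibernate (NHibernate/Id/GuidCombGenerator.cs).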
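I think I've figured out how to create COMB GUIDs in Rails (with the help of the UUIDTools 2.1.2 gem), but it leaves some open questions:
- Does PostgreSQL suffer from index fragmentation when the PRIMARY KEY is of type UUID?
- Is fragmentation avoided if the low-order 6 bytes of the GUID are sequential?
- Is the COMB GUID as implemented below an acceptable, reliable way to create sequential GUIDs in Rails?
Thanks for your thoughts.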
create_contacts.rb (migration)
class CreateContacts < ActiveRecord::Migration
  def up
    create_table :contacts, :id => false do |t|
      t.column :id, :uuid, :null => false # manually create :id with underlying DB type UUID
      t.string :first_name
      t.string :last_name
      t.string :email
      t.timestamps
    end
    execute "ALTER TABLE contacts ADD PRIMARY KEY (id);"
  end

  # Can't use reversible migration because it will try to run 'execute' again
  def down
    drop_table :contacts # also drops primary key
  end
end
/app/models/contact.rb
class Contact < ActiveRecord::Base
  require 'uuid_helper' # Rails 3 does not autoload from lib/*
  include UUIDHelper
  set_primary_key :id
end
/lib/uuid_helper.rb
require 'uuidtools'

module UUIDHelper
  def self.included(base)
    base.class_eval do
      include InstanceMethods
      attr_readonly :id # writable only on a new record
      before_create :set_uuid
    end
  end

  module InstanceMethods
    private

    def set_uuid
      # MS SQL syntax: CAST(CAST(NEWID() AS BINARY(10)) + CAST(GETDATE() AS BINARY(6)) AS UNIQUEIDENTIFIER)
      # Get current Time object
      utc_timestamp = Time.now.utc
      # Convert to integer with milliseconds: (Seconds since Epoch * 1000) + (6-digit microsecond fraction / 1000)
      utc_timestamp_with_ms_int = (utc_timestamp.tv_sec * 1000) + (utc_timestamp.tv_usec / 1000)
      # Format as hex, minimum of 12 digits, with leading zero. Note that 12 hex digits handles to year 10889 (*).
      utc_timestamp_with_ms_hexstring = "%012x" % utc_timestamp_with_ms_int
      # If we supply UUIDTools with a MAC address, it will use that rather than retrieving from system.
      # Use a regular expression to split into array, then insert ":" characters so it "looks" like a MAC address.
      # Note: this setter is class-level, so the value is shared by every caller in the process.
      UUIDTools::UUID.mac_address = utc_timestamp_with_ms_hexstring.scan(/.{2}/).join(":")
      # Generate Version 1 UUID (see RFC 4122).
      comb_guid = UUIDTools::UUID.timestamp_create().to_s
      # Assign generated COMBination GUID to .id
      self.id = comb_guid
      # (*) A note on maximum time handled by 6-byte timestamp that includes milliseconds:
      # If utc_timestamp_with_ms_hexstring = "FFFFFFFFFFFF" (12 F's), then
      # Time.at(Float(utc_timestamp_with_ms_hexstring.hex)/1000).utc.iso8601(10) = "10889-08-02T05:31:50.6550292968Z".
    end
  end
end
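Answer 0 (score: 4)
- Does PostgreSQL suffer from index fragmentation when the PRIMARY KEY is of type UUID?
Yes, that is to be expected. But if you are going to use the COMB strategy, it won't happen: the rows will always be in order (this is not entirely true, but bear with me).
Also, the performance difference between the native pgsql UUID type and VARCHAR is not all that large. That is another point to consider.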
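- Is fragmentation avoided if the low-order 6 bytes of the GUID are sequential?
In my tests I found that UUID1 (RFC 4122) values are sequential, since a timestamp is already included in the generated uuid. However, putting a timestamp in the last 6 bytes as well will ensure the ordering. That is what I did, because apparently the timestamp that is already present does not guarantee order. More about COMB here.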
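To illustrate why the timestamp already inside a version 1 UUID may not guarantee order, here is a minimal sketch. It relies on the RFC 4122 layout, in which the low 32 bits of the timestamp come first. The helper `v1_from_timestamp` and the sample values are hypothetical, made up for this illustration only; how a given database actually compares uuid values is not verified here.

```python
import uuid

def v1_from_timestamp(ts_100ns, clock_seq=0, node=0x0123456789AB):
    """Hypothetical helper: build a version 1 UUID from a 60-bit timestamp."""
    time_low = ts_100ns & 0xFFFFFFFF                          # low 32 bits come first
    time_mid = (ts_100ns >> 32) & 0xFFFF                      # next 16 bits
    time_hi_version = ((ts_100ns >> 48) & 0x0FFF) | 0x1000    # top 12 bits plus version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80   # RFC 4122 variant bits
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))

# Two made-up timestamps one tick (100 ns) apart, straddling a rollover of time_low.
earlier = v1_from_timestamp(0x1EC9414FFFFFFFF)
later = v1_from_timestamp(0x1EC941500000000)

print(earlier, later)
print(later.time > earlier.time)    # the embedded timestamp increased
print(later.bytes < earlier.bytes)  # yet the raw bytes compare as smaller
```

With a fixed node, a bytewise comparison follows time only until `time_low` wraps around, which is consistent with the observation that the built-in timestamp alone does not keep the values sorted.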
- Is the COMB GUID as implemented below an acceptable, reliable way to create sequential GUIDs in Rails?
I don't use Rails, but I'll show you how I did it in Django:
import uuid, time

def uuid1_comb(obj):
    # Pass the current time in milliseconds as the 48-bit node value
    return uuid.uuid1(node=int(time.time() * 1000))
where node is a 48-bit positive integer that normally identifies the hardware address.
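As a quick check of what ends up in the generated value, the sketch below calls the same function and inspects the result. It assumes only the standard library; the default `obj=None` is added here so the function can be called on its own and is not part of the original.

```python
import time
import uuid

def uuid1_comb(obj=None):  # obj=None is an assumption made for this standalone sketch
    # Use the current time in milliseconds as the 48-bit node value
    return uuid.uuid1(node=int(time.time() * 1000))

before_ms = int(time.time() * 1000)
guid = uuid1_comb()
after_ms = int(time.time() * 1000)

# The node occupies the last 12 hex digits of the textual form.
tail = str(guid)[-12:]
print(guid)
print(int(tail, 16) == guid.node)
print(before_ms <= guid.node <= after_ms)
```

The last group of the printed uuid is therefore the millisecond timestamp in hex, which is the part a COMB relies on.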
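Regarding your implementation, one of the main advantages of uuids is that you can safely generate them outside the database, so using a helper class is a perfectly valid approach. You could always use an external service such as snowflake to generate the ids, but that is probably premature optimization at this point.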